NOVEL NUCLEIC ACIDS AND POLYPEPTIDES

1. TECHNICAL FIELD 

The present invention provides novel polynucleotides and proteins encoded by such
polynucleotides, along with uses for these polynucleotides and proteins, for example in
therapeutic, diagnostic and research methods.

2. BACKGROUND 

Technology aimed at the discovery of protein factors (including, e.g., cytokines, such as
lymphokines, interferons, CSFs, chemokines, and interleukins) has matured rapidly over the past
decade. The now routine hybridization cloning and expression cloning techniques clone novel
polynucleotides "directly" in the sense that they rely on information directly related to the
discovered protein (i.e., partial DNA/amino acid sequence of the protein in the case of
hybridization cloning; activity of the protein in the case of expression cloning). More recent
"indirect" cloning techniques such as signal sequence cloning, which isolates DNA sequences
based on the presence of a now well-recognized secretory leader sequence motif, as well as
various PCR-based or low stringency hybridization-based cloning techniques, have advanced the
state of the art by making available large numbers of DNA/amino acid sequences for proteins
that are known to have biological activity, for example, by virtue of their secreted nature in the
case of leader sequence cloning, by virtue of their cell or tissue source in the case of PCR-based
techniques, or by virtue of structural similarity to other genes of known biological activity.

Identified polynucleotide and polypeptide sequences have numerous applications in, for
example, diagnostics, forensics, gene mapping, identification of mutations responsible for
genetic disorders or other traits, assessment of biodiversity, and production of many other types
of data and products dependent on DNA and amino acid sequences.



3. SUMMARY OF THE INVENTION 

The compositions of the present invention include novel isolated polypeptides, novel
isolated polynucleotides encoding such polypeptides, including recombinant DNA molecules,
cloned genes or degenerate variants thereof, especially naturally occurring variants such as allelic
variants, antisense polynucleotide molecules, and antibodies that specifically recognize one or more
epitopes present on such polypeptides, as well as hybridomas producing such antibodies.

The compositions of the present invention additionally include vectors, including expression
vectors, containing the polynucleotides of the invention, cells genetically engineered to contain such
polynucleotides and cells genetically engineered to express such polynucleotides.

The present invention relates to a collection or library of at least one novel nucleic acid
sequence assembled from expressed sequence tags (ESTs) isolated mainly by sequencing by
hybridization (SBH), and in some cases, sequences obtained from one or more public databases.
The invention relates also to the proteins encoded by such polynucleotides, along with therapeutic,
diagnostic and research utilities for these polynucleotides and proteins. These nucleic acid
sequences are designated as SEQ ID NO: 1-1786 and 3573-5358. The polypeptide sequences are
designated SEQ ID NO: 2n (wherein n = 1 to 20). The nucleic acids and polypeptides are provided
in the Sequence Listing. In the nucleic acids provided in the Sequence Listing, A is adenine; C is
cytosine; G is guanine; T is thymine; and N is any of the four bases. In the amino acids provided in
the Sequence Listing, * corresponds to the stop codon.

The nucleic acid sequences of the present invention also include nucleic acid sequences that
hybridize to the complement of SEQ ID NO: 1-1786 and 3573-5358 under stringent hybridization
conditions; nucleic acid sequences which are allelic variants or species homologues of any of the
nucleic acid sequences recited above; or nucleic acid sequences that encode a peptide comprising a
specific domain or truncation of the peptides encoded by SEQ ID NO: 1-1786 and 3573-5358. Also
included is a polynucleotide comprising a nucleotide sequence having at least 90% identity to an
identifying sequence of SEQ ID NO: 1-1786 and 3573-5358 or a degenerate variant or fragment
thereof. The identifying sequence can be 100 base pairs in length.

The nucleic acid sequences of the present invention also include the sequence information
from the nucleic acid sequences of SEQ ID NO: 1-1786 and 3573-5358. The sequence information
can be a segment of any one of SEQ ID NO: 1-1786 and 3573-5358 that uniquely identifies or
represents the sequence information of SEQ ID NO: 1-1786 and 3573-5358.

A collection as used in this application can be a collection of only one polynucleotide. The 
collection of sequence information or identifying information of each sequence can be provided on 
a nucleic acid array. In one embodiment, segments of sequence information are provided on a
nucleic acid array to detect the polynucleotide that contains the segment. The array can be designed
to detect full-match or mismatch to the polynucleotide that contains the segment. The collection
can also be provided in a computer-readable format.

This invention also includes the reverse or direct complement of any of the nucleic acid 
sequences recited above; cloning or expression vectors containing the nucleic acid sequences; and 
host cells or organisms transformed with these expression vectors. Nucleic acid sequences (or their 
reverse or direct complements) according to the invention have numerous applications in a variety 
of techniques known to those skilled in the art of molecular biology, such as use as hybridization 
probes, use as primers for PCR, use in an array, use in computer-readable media, use in sequencing
full-length genes, use for chromosome and gene mapping, use in the recombinant production of
protein, and use in the generation of anti-sense DNA or RNA, their chemical analogs and the like.

In a preferred embodiment, the nucleic acid sequences of SEQ ID NO: 1-1786 and 3573-5358
or novel segments or parts of the nucleic acids of the invention are used as primers in
expression assays that are well known in the art. In a particularly preferred embodiment, the nucleic
acid sequences of SEQ ID NO: 1-1786 and 3573-5358 or novel segments or parts of the nucleic
acids provided herein are used in diagnostics for identifying expressed genes or, as well known in
the art and exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence tags for
physical mapping of the human genome.

The isolated polynucleotides of the invention include, but are not limited to, a 
polynucleotide comprising any one of the nucleotide sequences set forth in SEQ ID NO: 1-1786 and 
3573-5358; a polynucleotide comprising any of the full length protein coding sequences of SEQ ID 
NO: 1-1786 and 3573-5358; and a polynucleotide comprising any of the nucleotide sequences of the
mature protein coding sequences of SEQ ID NO: 1-1786 and 3573-5358. The polynucleotides of the
present invention also include, but are not limited to, a polynucleotide that hybridizes under
stringent hybridization conditions to (a) the complement of any one of the nucleotide sequences set
forth in SEQ ID NO: 1-1786 and 3573-5358; (b) a nucleotide sequence encoding any one of the
amino acid sequences set forth in the Sequence Listing; (c) a polynucleotide which is an allelic 
variant of any polynucleotides recited above; (d) a polynucleotide which encodes a species homolog 
(e.g. orthologs) of any of the proteins recited above; or (e) a polynucleotide that encodes a 
polypeptide comprising a specific domain or truncation of any of the polypeptides comprising an 
amino acid sequence set forth in the Sequence Listing. 

The isolated polypeptides of the invention include, but are not limited to, a polypeptide 
comprising any of the amino acid sequences set forth in the Sequence Listing; or the corresponding
full length or mature protein. Polypeptides of the invention also include polypeptides with biological
activity that are encoded by (a) any of the polynucleotides having a nucleotide sequence set forth in
SEQ ID NO: 1-1786 and 3573-5358; or (b) polynucleotides that hybridize to the complement of the
polynucleotides of (a) under stringent hybridization conditions. Biologically or immunologically
active variants of any of the polypeptide sequences in the Sequence Listing, and "substantial 
equivalents" thereof (e.g., with at least about 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or 99% 
amino acid sequence identity) that preferably retain biological activity are also contemplated. The 
polypeptides of the invention may be wholly or partially chemically synthesized but are preferably 
produced by recombinant means using the genetically engineered cells (e.g. host cells) of the 
invention. 
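
By way of illustration only, the percent identity thresholds recited above for "substantial equivalents" can be checked with a short sketch such as the following. It assumes the two amino acid sequences have already been aligned to equal length by some external method, with "-" marking gaps; the function names, the gap convention and the identity formula (identical columns divided by aligned columns) are assumptions made for this example and are not taken from the specification.

```python
# Thresholds recited in the text for "substantial equivalents" (percent identity).
THRESHOLDS = (65, 70, 75, 80, 85, 90, 95, 98, 99)

def percent_identity(aligned_a, aligned_b):
    """Percent of aligned columns holding the same residue in both sequences.

    Assumes the inputs are pre-aligned, equal-length strings using '-' for gaps.
    Columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a.upper(), aligned_b.upper())
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    identical = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * identical / len(columns)

def highest_threshold_met(identity):
    """Return the largest recited threshold that the identity satisfies, or None."""
    met = [t for t in THRESHOLDS if identity >= t]
    return max(met) if met else None
```

For example, two ten-residue sequences differing at one position score 90% under this formula and so satisfy the 90% threshold but not the 95% threshold. Other definitions of identity (for example, ones that treat gaps differently) would give different values.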

The invention also provides compositions comprising a polypeptide of the invention.
Polypeptide compositions of the invention may further comprise an acceptable carrier, such as a
hydrophilic, e.g., pharmaceutically acceptable, carrier.

The invention also provides host cells transformed or transfected with a polynucleotide of 
the invention. 

The invention also relates to methods for producing a polypeptide of the invention 
comprising growing a culture of the host cells of the invention in a suitable culture medium 
under conditions permitting expression of the desired polypeptide, and purifying the polypeptide 
from the culture or from the host cells. Preferred embodiments include those in which the 
protein produced by such process is a mature form of the protein. 

Polynucleotides according to the invention have numerous applications in a variety of 
techniques known to those skilled in the art of molecular biology. These techniques include use 
as hybridization probes, use as oligomers, or primers, for PCR, use for chromosome and gene 
mapping, use in the recombinant production of protein, and use in generation of anti-sense DNA 
or RNA, their chemical analogs and the like. For example, when the expression of an mRNA is 
largely restricted to a particular cell or tissue type, polynucleotides of the invention can be used 
as hybridization probes to detect the presence of the particular cell or tissue mRNA in a sample 
using, e.g., in situ hybridization. 

In other exemplary embodiments, the polynucleotides are used in diagnostics as 
expressed sequence tags for identifying expressed genes or, as well known in the art and 
exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence tags for physical 
mapping of the human genome. 

The polypeptides according to the invention can be used in a variety of conventional 
procedures and methods that are currently applied to other proteins. For example, a polypeptide 
of the invention can be used to generate an antibody that specifically binds the polypeptide. Such 
antibodies, particularly monoclonal antibodies, are useful for detecting or quantitating the 
polypeptide in tissue. The polypeptides of the invention can also be used as molecular weight 
markers, and as a food supplement. 

Methods are also provided for preventing, treating, or ameliorating a medical condition 
which comprises the step of administering to a mammalian subject a therapeutically effective 
amount of a composition comprising a polypeptide of the present invention and a 
pharmaceutically acceptable carrier.

In particular, the polypeptides and polynucleotides of the invention can be utilized, for 
example, in methods for the prevention and/or treatment of disorders involving aberrant protein 
expression or biological activity. 

The present invention further relates to methods for detecting the presence of the
polynucleotides or polypeptides of the invention in a sample. Such methods can, for example, be 
utilized as part of prognostic and diagnostic evaluation of disorders as recited herein and for the 
identification of subjects exhibiting a predisposition to such conditions. The invention provides 
a method for detecting the polynucleotides of the invention in a sample, comprising contacting 
the sample with a compound that binds to and forms a complex with the polynucleotide of 
interest for a period sufficient to form the complex and under conditions sufficient to form a 
complex and detecting the complex such that if a complex is detected, the polynucleotide of 
interest is detected. The invention also provides a method for detecting the polypeptides of the 
invention in a sample comprising contacting the sample with a compound that binds to and forms 
a complex with the polypeptide under conditions and for a period sufficient to form the complex 
and detecting the formation of the complex such that if a complex is formed, the polypeptide is 
detected. 

The invention also provides kits comprising polynucleotide probes and/or monoclonal 
antibodies, and optionally quantitative standards, for carrying out methods of the invention. 
Furthermore, the invention provides methods for evaluating the efficacy of drugs, and 
monitoring the progress of patients, involved in clinical trials for the treatment of disorders as 
recited above. 

The invention also provides methods for the identification of compounds that modulate 
(i.e., increase or decrease) the expression or activity of the polynucleotides and/or polypeptides 
of the invention. Such methods can be utilized, for example, for the identification of compounds 
that can ameliorate symptoms of disorders as recited herein. Such methods can include, but are 
not limited to, assays for identifying compounds and other substances that interact with (e.g., 
bind to) the polypeptides of the invention. The invention provides a method for identifying a 
compound that binds to the polypeptides of the invention comprising contacting the compound 
with a polypeptide of the invention in a cell for a time sufficient to form a polypeptide/compound 
complex, wherein the complex drives expression of a reporter gene sequence in the cell; and 
detecting the complex by detecting the reporter gene sequence expression such that if expression 
of the reporter gene is detected the compound that binds to a polypeptide of the invention is
identified. 

The invention also provides methods for treatment which involve the
administration of the polynucleotides or polypeptides of the invention to individuals exhibiting 
symptoms or tendencies. In addition, the invention encompasses methods for treating diseases or 
disorders as recited herein comprising administering compounds and other substances that 
modulate the overall activity of the target gene products. Compounds and other substances can 
effect such modulation either on the level of target gene/protein expression or target protein
activity. 

The polypeptides of the present invention and the polynucleotides encoding them are also 
useful for the same functions known to one of skill in the art as the polypeptides and 
polynucleotides to which they have homology (set forth in Table 2); for which they have a 
signature region (as set forth in Table 3); or for which they have homology to a gene family (as 
set forth in Table 4). If no homology is set forth for a sequence, then the polypeptides and 
polynucleotides of the present invention are useful for a variety of applications, as described
herein, including use in arrays for detection. 

4. DETAILED DESCRIPTION OF THE INVENTION 
4.1 DEFINITIONS 

It must be noted that as used herein and in the appended claims, the singular forms "a",
"an" and "the" include plural references unless the context clearly dictates otherwise. 

The term "active" refers to those forms of the polypeptide which retain the biologic 
and/or immunologic activities of any naturally occurring polypeptide. According to the 
invention, the terms "biologically active" or "biological activity" refer to a protein or peptide 
having structural, regulatory or biochemical functions of a naturally occurring molecule. 
Likewise "immunologically active" or "immunological activity" refers to the capability of the 
natural, recombinant or synthetic polypeptide to induce a specific immune response in 
appropriate animals or cells and to bind with specific antibodies. 

The term "activated cells" as used in this application refers to those cells which are engaged in
extracellular or intracellular membrane trafficking, including the export of secretory or 
enzymatic molecules as part of a normal or disease process. 

The terms "complementary" or "complementarity" refer to the natural binding of 
polynucleotides by base pairing. For example, the sequence 5'-AGT-3' binds to the
complementary sequence 3'-TCA-5'. Complementarity between two single-stranded molecules
may be "partial" such that only some of the nucleic acids bind or it may be "complete" such that 
total complementarity exists between the single stranded molecules. The degree of 
complementarity between the nucleic acid strands has significant effects on the efficiency and 
strength of the hybridization between the nucleic acid strands. 
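
As a minimal sketch of the base pairing described above, the following code derives the complementary strand of a DNA sequence and scores partial versus complete complementarity between two strands. The function names and the simple position-by-position scoring are assumptions made for this illustration; the sketch does not model hybridization strength or stringency.

```python
# Watson-Crick pairing for DNA bases.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement(seq_5_to_3):
    """Return the complementary strand, written 3'->5' (same left-to-right positions)."""
    return "".join(PAIRS[base] for base in seq_5_to_3.upper())

def reverse_complement(seq_5_to_3):
    """Return the complementary strand rewritten in the conventional 5'->3' direction."""
    return complement(seq_5_to_3)[::-1]

def fraction_complementary(strand_5_to_3, other_3_to_5):
    """Fraction of positions that base-pair: 1.0 is complete, lower values are partial."""
    if len(strand_5_to_3) != len(other_3_to_5):
        raise ValueError("strands must be the same length")
    paired = sum(1 for a, b in zip(strand_5_to_3.upper(), other_3_to_5.upper())
                 if PAIRS.get(a) == b)
    return paired / len(strand_5_to_3)
```

With these definitions, `complement("AGT")` gives `"TCA"`, matching the 5'-AGT-3' / 3'-TCA-5' example in the text, and a strand pair with one non-pairing position out of three is scored as partially complementary.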

The term "embryonic stem cells (ES)" refers to a cell that can give rise to many 
differentiated cell types in an embryo or an adult, including the germ cells. The term "germ line 
stem cells (GSCs)" refers to stem cells derived from primordial stem cells that provide a steady 
and continuous source of germ cells for the production of gametes. The term "primordial germ
cells (PGCs)" refers to a small population of cells set aside from other cell lineages particularly
from the yolk sac, mesenteries, or gonadal ridges during embryogenesis that have the potential to
differentiate into germ cells and other cells. PGCs are the source from which GSCs and ES cells
are derived. The PGCs, the GSCs and the ES cells are capable of self-renewal. Thus these cells
not only populate the germ line and give rise to a plurality of terminally differentiated cells that 
comprise the adult specialized organs, but are able to regenerate themselves. 

The term "expression modulating fragment," EMF, means a series of nucleotides which 
modulates the expression of an operably linked ORF or another EMF. 

As used herein, a sequence is said to "modulate the expression of an operably linked 
sequence" when the expression of the sequence is altered by the presence of the EMF. EMFs 
include, but are not limited to, promoters, and promoter modulating sequences (inducible 
elements). One class of EMFs are nucleic acid fragments which induce the expression of an 
operably linked ORF in response to a specific regulatory factor or physiological event. 

The terms "nucleotide sequence" or "nucleic acid" or "polynucleotide" or
"oligonucleotide" are used interchangeably and refer to a heteropolymer of nucleotides or the
sequence of these nucleotides. These phrases also refer to DNA or RNA of genomic or synthetic
origin which may be single-stranded or double-stranded and may represent the sense or the
antisense strand, to peptide nucleic acid (PNA) or to any DNA-like or RNA-like material. In the
sequences herein A is adenine, C is cytosine, T is thymine, G is guanine and N is A, C, G or T 
(U). It is contemplated that where the polynucleotide is RNA, the T (thymine) in the sequences 
provided herein is substituted with U (uracil). Generally, nucleic acid segments provided by this 
invention may be assembled from fragments of the genome and short oligonucleotide linkers, or 
from a series of oligonucleotides, or from individual nucleotides, to provide a synthetic nucleic 
acid which is capable of being expressed in a recombinant transcriptional unit comprising 
regulatory elements derived from a microbial or viral operon, or a eukaryotic gene. 
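
The letter convention above (A, C, G, T and N, with U replacing T where the polynucleotide is RNA) can be sketched as follows. The function name and the decision to reject any other letter are assumptions made for this illustration only.

```python
# Letters used for nucleic acid sequences in the text: A, C, G, T and N (any base).
DNA_ALPHABET = set("ACGTN")

def to_rna(dna):
    """Return the RNA form of a DNA sequence by substituting U (uracil) for T (thymine)."""
    dna = dna.upper()
    unexpected = set(dna) - DNA_ALPHABET
    if unexpected:
        raise ValueError("unexpected letters: " + "".join(sorted(unexpected)))
    return dna.replace("T", "U")
```

For instance, `to_rna("agtn")` returns `"AGUN"`; the N placeholder is carried through unchanged.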

The terms "oligonucleotide fragment" or a "polynucleotide fragment", "portion," or
"segment" or "probe" or "primer" are used interchangeably and refer to a sequence of nucleotide
residues which are at least about 5 nucleotides, more preferably at least about 7 nucleotides,
more preferably at least about 9 nucleotides, more preferably at least about 11 nucleotides and
most preferably at least about 17 nucleotides. The fragment is preferably less than about 500
nucleotides, preferably less than about 200 nucleotides, more preferably less than about 100 
nucleotides, more preferably less than about 50 nucleotides and most preferably less than 30 
nucleotides. Preferably the probe is from about 6 nucleotides to about 200 nucleotides, 
preferably from about 15 to about 50 nucleotides, more preferably from about 17 to 30 
nucleotides and most preferably from about 20 to 25 nucleotides. Preferably the fragments can
be used in polymerase chain reaction (PCR), various hybridization procedures or microarray 
procedures to identify or amplify identical or related parts of mRNA or DNA molecules. A 
fragment or segment may uniquely identify each polynucleotide sequence of the present 
invention. Preferably the fragment comprises a sequence substantially similar to any one of SEQ 
ID NOs: 1-20.

Probes may, for example, be used to determine whether specific mRNA molecules are 
present in a cell or tissue or to isolate similar nucleic acid sequences from chromosomal DNA as 
described by Walsh et al. (Walsh, P.S. et al., 1992, PCR Methods Appl 1:241-250). They may 
be labeled by nick translation, Klenow fill-in reaction, PCR, or other methods well known in the 
art. Probes of the present invention, their preparation and/or labeling are elaborated in
Sambrook, J. et al., 1989, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor 
Laboratory, NY; or Ausubel, F.M. et al., 1989, Current Protocols in Molecular Biology, John 
Wiley & Sons, New York NY, both of which are incorporated herein by reference in their 
entirety. 

The nucleic acid sequences of the present invention also include the sequence 
information from the nucleic acid sequences of SEQ ID NO: 1-1786 and 3573-5358. The
sequence information can be a segment of any one of SEQ ID NO: 1-1786 and 3573-5358 that
uniquely identifies or represents the sequence information of that sequence of SEQ ID NO: 1-1786
and 3573-5358. One such segment can be a twenty-mer nucleic acid sequence because the
probability that a twenty-mer is fully matched in the human genome is approximately 1 in 300. In
the human genome, there are three billion base pairs in one set of chromosomes. Because 4^20
possible twenty-mers exist, there are approximately 300 times more twenty-mers than there are
base pairs in a set of human chromosomes. Using the same analysis, the probability for a
seventeen-mer to be fully matched in the human genome is approximately 1 in 5. When these
segments are used in arrays for expression studies, fifteen-mer segments can be used. The
probability that the fifteen-mer is fully matched in the expressed sequences is also approximately
one in five because expressed sequences comprise less than approximately 5% of the entire
genome sequence.
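
The arithmetic behind these estimates can be reproduced with a short sketch. It assumes, as the passage implicitly does, that bases are uniformly random and independent, and it treats the expected number of chance matches (target size divided by the number of possible k-mers) as the match probability. The constant and function names are chosen for this illustration, and 5% is used as the expressed fraction even though the text gives it only as an upper bound.

```python
GENOME_BP = 3_000_000_000   # base pairs in one set of human chromosomes, per the text
EXPRESSED_FRACTION = 0.05   # upper bound on expressed sequence, per the text

def possible_kmers(k):
    """Number of distinct sequences of length k over the four bases."""
    return 4 ** k

def expected_full_matches(k, target_bp):
    """Expected number of chance full matches of one given k-mer in target_bp random bases."""
    return target_bp / possible_kmers(k)

cases = [
    (20, GENOME_BP, "twenty-mer, whole genome"),
    (17, GENOME_BP, "seventeen-mer, whole genome"),
    (15, GENOME_BP * EXPRESSED_FRACTION, "fifteen-mer, expressed sequences"),
]
for k, target_bp, label in cases:
    odds = 1 / expected_full_matches(k, target_bp)
    print(f"{label}: about 1 in {odds:.1f}")
```

Under these assumptions the computed odds come out at roughly 1 in 370, 1 in 6 and 1 in 7 respectively, the same order of magnitude as the rounded figures of 1 in 300 and 1 in 5 given above.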

Similarly, when using sequence information for detecting a single mismatch, a segment can 
be a twenty-five mer. The probability that the twenty-five mer would appear in a human genome 
with a single mismatch is calculated by multiplying the probability for a full match (1/4^25) times the
increased probability for mismatch at each nucleotide position (3 x 25). The probability that an 
eighteen mer with a single mismatch can be detected in an array for expression studies is 
approximately one in five. The probability that a twenty-mer with a single mismatch can be 
detected in a human genome is approximately one in five. 
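
The single-mismatch calculation can be sketched in the same way. The per-site factor (3 x k) / 4^k follows the passage directly; scaling that factor by the size of the target (three billion base pairs for the genome, or 5% of that for expressed sequences) is an assumption made here to arrive at figures comparable to the "one in five" estimates, as are the function and constant names.

```python
GENOME_BP = 3_000_000_000   # base pairs in one set of human chromosomes
EXPRESSED_FRACTION = 0.05   # assumed expressed fraction (the text gives this as an upper bound)

def single_mismatch_probability_per_site(k):
    """Full-match probability (1 / 4**k) times the 3 * k ways to place one mismatch."""
    return (3 * k) / 4 ** k

def expected_single_mismatch_hits(k, target_bp):
    """Expected chance occurrences of a k-mer with exactly one mismatch in target_bp bases."""
    return single_mismatch_probability_per_site(k) * target_bp

genome_20 = expected_single_mismatch_hits(20, GENOME_BP)
expressed_18 = expected_single_mismatch_hits(18, GENOME_BP * EXPRESSED_FRACTION)
print(f"twenty-mer, genome: about 1 in {1 / genome_20:.1f}")
print(f"eighteen-mer, expressed sequences: about 1 in {1 / expressed_18:.1f}")
```

Under these assumptions the twenty-mer figure is about 1 in 6 and the eighteen-mer figure about 1 in 8, broadly in line with the approximate values stated above.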

WO 01/53312 PCT/US00/34263

The term "open reading frame," ORF, means a series of nucleotide triplets coding for 
amino acids without any termination codons and is a sequence translatable into protein. 
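
As an illustration of this definition, the following sketch scans the three forward reading frames of a DNA sequence for runs of triplets that contain no termination codon. It uses the three stop codons of the standard genetic code and, following the definition above, does not require a start codon; the function names are hypothetical and the reverse strand is not examined.

```python
# Termination codons of the standard genetic code (DNA alphabet).
STOP_CODONS = {"TAA", "TAG", "TGA"}

def open_stretches(seq, frame=0):
    """Yield (start, end) indices of stop-free runs of complete codons in one forward frame."""
    seq = seq.upper()
    start = frame
    pos = frame
    while pos + 3 <= len(seq):
        if seq[pos:pos + 3] in STOP_CODONS:
            if pos > start:
                yield (start, pos)
            start = pos + 3      # the next run begins after the stop codon
        pos += 3
    if pos > start:              # trailing run that reaches the end of the sequence
        yield (start, pos)

def longest_open_stretch(seq):
    """Return the longest stop-free codon run found in any of the three forward frames."""
    seq = seq.upper()
    best = ""
    for frame in range(3):
        for start, end in open_stretches(seq, frame):
            if end - start > len(best):
                best = seq[start:end]
    return best
```

For the sequence ATGAAATAGCCC, frame 0 is split by the TAG codon into two short runs, while frame 2 reads GAA ATA GCC without interruption and is therefore the longest open stretch.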

The terms "operably linked" or "operably associated" refer to functionally related nucleic 
acid sequences. For example, a promoter is operably associated or operably linked with a coding 
sequence if the promoter controls the transcription of the coding sequence. While operably 
linked nucleic acid sequences can be contiguous and in the same reading frame, certain genetic 
elements e.g. repressor genes are not contiguously linked to the coding sequence but still control 
transcription/translation of the coding sequence. 

The term "pluripotent" refers to the capability of a cell to differentiate into a number of 
differentiated cell types that are present in an adult organism. A pluripotent cell is restricted in its 
differentiation capability in comparison to a totipotent cell. 

The terms "polypeptide" or "peptide" or "amino acid sequence" refer to an oligopeptide, 
peptide, polypeptide or protein sequence or fragment thereof and to naturally occurring or 
synthetic molecules. A polypeptide "fragment," "portion," or "segment" is a stretch of amino
acid residues of at least about 5 amino acids, preferably at least about 7 amino acids, more
preferably at least about 9 amino acids and most preferably at least about 17 or more amino
acids. The peptide preferably is not greater than about 200 amino acids, more preferably less
than 150 amino acids and most preferably less than 100 amino acids. Preferably the peptide is
from about 5 to about 200 amino acids. To be active, any polypeptide must have sufficient 
length to display biological and/or immunological activity. 

The term "naturally occurring polypeptide" refers to polypeptides produced by cells that 
have not been genetically engineered and specifically contemplates various polypeptides arising 
from post-translational modifications of the polypeptide including, but not limited to, acetylation, 
carboxylation, glycosylation, phosphorylation, lipidation and acylation. 

The term "translated protein coding portion" means a sequence which encodes for the full 
length protein which may include any leader sequence or any processing sequence. 

The term "mature protein coding sequence" means a sequence which encodes a peptide 
or protein without a signal or leader sequence. The "mature protein portion" means that portion 
of the protein which does not include a signal or leader sequence. The peptide may have been 
produced by processing in the cell which removes any leader/signal sequence. The mature 
protein portion may or may not include the initial methionine residue. The methionine residue 
may be removed from the protein during processing in the cell. The peptide may be produced 
synthetically or the protein may have been produced using a polynucleotide only encoding for 
the mature protein coding sequence. 

The term "derivative" refers to polypeptides chemically modified by such techniques as
ubiquitination, labeling (e.g., with radionuclides or various enzymes), covalent polymer 
attachment such as pegylation (derivatization with polyethylene glycol) and insertion or 
substitution by chemical synthesis of amino acids such as ornithine, which do not normally occur 
in human proteins. 

The term "variant" (or "analog") refers to any polypeptide differing from naturally
occurring polypeptides by amino acid insertions, deletions, and substitutions, created using, e.g.,
recombinant DNA techniques. Guidance in determining which amino acid residues may be 
replaced, added or deleted without abolishing activities of interest, may be found by comparing 
the sequence of the particular polypeptide with that of homologous peptides and minimizing the 
number of amino acid sequence changes made in regions of high homology (conserved regions) 
or by replacing amino acids with consensus sequence. 

Alternatively, recombinant variants encoding these same or similar polypeptides may be 
synthesized or selected by making use of the "redundancy" in the genetic code. Various codon 
substitutions, such as the silent changes which produce various restriction sites, may be 
introduced to optimize cloning into a plasmid or viral vector or expression in a particular 
prokaryotic or eukaryotic system. Mutations in the polynucleotide sequence may be reflected in 
the polypeptide or domains of other peptides added to the polypeptide to modify the properties of 
any part of the polypeptide, to change characteristics such as ligand-binding affinities, interchain 
affinities, or degradation/turnover rate. 

Preferably, amino acid "substitutions" are the result of replacing one amino acid with 
another amino acid having similar structural and/or chemical properties, i.e., conservative amino 
acid replacements. "Conservative" amino acid substitutions may be made on the basis of 
similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic 
nature of the residues involved. For example, nonpolar (hydrophobic) amino acids include
alanine, leucine, isoleucine, valine, proline, phenylalanine, tryptophan, and methionine; polar 
neutral amino acids include glycine, serine, threonine, cysteine, tyrosine, asparagine, and 
glutamine; positively charged (basic) amino acids include arginine, lysine, and histidine; and 
negatively charged (acidic) amino acids include aspartic acid and glutamic acid. "Insertions" or 
"deletions" are preferably in the range of about 1 to 20 amino acids, more preferably 1 to 10 
amino acids. The variation allowed may be experimentally determined by systematically making 
insertions, deletions, or substitutions of amino acids in a polypeptide molecule using 
recombinant DNA techniques and assaying the resulting recombinant variants for activity. 
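
The grouping of amino acids given above can be expressed as a small lookup that flags whether a proposed substitution stays within one group. This is only a sketch: the one-letter codes are the standard ones for the residues named in the text, but the function names and the rule that "same group" means "conservative" are simplifying assumptions, since the text lists several properties on which similarity may be judged.

```python
# Groups as listed in the text, using standard one-letter amino acid codes.
GROUPS = {
    "nonpolar": set("ALIVPFWM"),      # Ala, Leu, Ile, Val, Pro, Phe, Trp, Met
    "polar_neutral": set("GSTCYNQ"),  # Gly, Ser, Thr, Cys, Tyr, Asn, Gln
    "basic": set("RKH"),              # Arg, Lys, His
    "acidic": set("DE"),              # Asp, Glu
}

def group_of(residue):
    """Return the name of the group containing a one-letter residue code."""
    residue = residue.upper()
    for name, members in GROUPS.items():
        if residue in members:
            return name
    raise ValueError("unknown residue: " + residue)

def is_conservative(original, replacement):
    """True if the replacement is a different residue drawn from the same group."""
    return (original.upper() != replacement.upper()
            and group_of(original) == group_of(replacement))
```

Under this rule, replacing leucine with isoleucine is conservative, whereas replacing lysine with glutamic acid is not. Whether a given substitution actually preserves activity would still need to be determined experimentally, as described above.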

Alternatively, where alteration of function is desired, insertions, deletions or 
non-conservative alterations can be engineered to produce altered polypeptides. Such alterations
can, for example, alter one or more of the biological functions or biochemical characteristics of
the polypeptides of the invention. For example, such alterations may change polypeptide 
characteristics such as ligand-binding affinities, interchain affinities, or degradation/turnover 
rate. Further, such alterations can be selected so as to generate polypeptides that are better suited 
for expression, scale up and the like in the host cells chosen for expression. For example, 
cysteine residues can be deleted or substituted with another amino acid residue in order to 
eliminate disulfide bridges. 

The terms "purified" or "substantially purified" as used herein denote that the indicated
nucleic acid or polypeptide is present in the substantial absence of other biological 
macromolecules, e.g., polynucleotides, proteins, and the like. In one embodiment, the 
polynucleotide or polypeptide is purified such that it constitutes at least 95% by weight, more 
preferably at least 99% by weight, of the indicated biological macromolecules present (but water, 
buffers, and other small molecules, especially molecules having a molecular weight of less than 
1000 daltons, can be present). 

The term "isolated" as used herein refers to a nucleic acid or polypeptide separated from 
at least one other component (e.g., nucleic acid or polypeptide) present with the nucleic acid or 
polypeptide in its natural source. In one embodiment, the nucleic acid or polypeptide is found in 
the presence of (if anything) only a solvent, buffer, ion, or other component normally present in a 
solution of the same. The terms "isolated" and "purified" do not encompass nucleic acids or 
polypeptides present in their natural source. 

The term "recombinant," when used herein to refer to a polypeptide or protein, means 
that a polypeptide or protein is derived from recombinant (e.g., microbial, insect, or mammalian) 
expression systems. "Microbial" refers to recombinant polypeptides or proteins made in 
bacterial or fungal (e.g., yeast) expression systems. As a product, "recombinant microbial" 
defines a polypeptide or protein essentially free of native endogenous substances and 
unaccompanied by associated native glycosylation. Polypeptides or proteins expressed in most 
bacterial cultures, e.g., E. coli, will be free of glycosylation modifications; polypeptides or
proteins expressed in yeast will have a glycosylation pattern in general different from those 
expressed in mammalian cells. 

The term "recombinant expression vehicle or vector" refers to a plasmid or phage or virus 
or vector, for expressing a polypeptide from a DNA (RNA) sequence. An expression vehicle can 
comprise a transcriptional unit comprising an assembly of (1) a genetic element or elements 
having a regulatory role in gene expression, for example, promoters or enhancers, (2) a structural 
or coding sequence which is transcribed into mRNA and translated into protein, and (3)
appropriate transcription initiation and termination sequences. Structural units intended for use
in yeast or eukaryotic expression systems preferably include a leader sequence enabling
extracellular secretion of translated protein by a host cell. Alternatively, where recombinant
protein is expressed without a leader or transport sequence, it may include an amino terminal
methionine residue. This residue may or may not be subsequently cleaved from the expressed
recombinant protein to provide a final product.

The term "recombinant expression system" means host cells which have stably integrated 
a recombinant transcriptional unit into chromosomal DNA or carry the recombinant 
transcriptional unit extrachromosomally. Recombinant expression systems as defined herein will 
express heterologous polypeptides or proteins upon induction of the regulatory elements linked 
to the DNA segment or synthetic gene to be expressed. This term also means host cells which
have stably integrated a recombinant genetic element or elements having a regulatory role in 
gene expression, for example, promoters or enhancers. Recombinant expression systems as 
defined herein will express polypeptides or proteins endogenous to the cell upon induction of the 
regulatory elements linked to the endogenous DNA segment or gene to be expressed. The cells 
can be prokaryotic or eukaryotic.

The term "secreted" includes a protein that is transported across or through a membrane, 
including transport as a result of signal sequences in its amino acid sequence when it is expressed 
in a suitable host cell. "Secreted" proteins include without limitation proteins secreted wholly 
(e.g., soluble proteins) or partially (e.g., receptors) from the cell in which they are expressed.
"Secreted" proteins also include without limitation proteins that are transported across the
membrane of the endoplasmic reticulum. "Secreted" proteins are also intended to include
proteins containing non-typical signal sequences (e.g., Interleukin-1 Beta, see Krasney, P.A. and
Young, P.R. (1992) Cytokine 4(2):134-143) and factors released from damaged cells (e.g.,
Interleukin-1 Receptor Antagonist, see Arend, W.P. et al. (1998) Annu. Rev. Immunol.
16:27-55).

Where desired, an expression vector may be designed to contain a "signal or leader 
sequence" which will direct the polypeptide through the membrane of a cell. Such a sequence 
may be naturally present on the polypeptides of the present invention or provided from 
heterologous protein sources by recombinant DNA techniques. 

The term "stringent" is used to refer to conditions that are commonly understood in the
art as stringent. Stringent conditions can include highly stringent conditions (i.e., hybridization
to filter-bound DNA in 0.5 M NaHPO4, 7% sodium dodecyl sulfate (SDS), 1 mM EDTA at
65°C, and washing in 0.1X SSC/0.1% SDS at 68°C), and moderately stringent conditions (i.e.,
washing in 0.2X SSC/0.1% SDS at 42°C). Other exemplary hybridization conditions are
described herein in the examples.

In instances of hybridization of deoxyoligonucleotides, additional exemplary stringent
hybridization conditions include washing in 6X SSC/0.05% sodium pyrophosphate at 37°C (for
14-base oligonucleotides), 48°C (for 17-base oligonucleotides), 55°C (for 20-base
oligonucleotides), and 60°C (for 23-base oligonucleotides).

As used herein, "substantially equivalent" can refer both to nucleotide and amino acid 
sequences, for example a mutant sequence, that varies from a reference sequence by one or more 
substitutions, deletions, or additions, the net effect of which does not result in an adverse 
functional dissimilarity between the reference and subject sequences. Typically, such a 
substantially equivalent sequence varies from one of those listed herein by no more than about 
35% (i.e., the number of individual residue substitutions, additions, and/or deletions in a 
substantially equivalent sequence, as compared to the corresponding reference sequence, divided 
by the total number of residues in the substantially equivalent sequence is about 0.35 or less). 
Such a sequence is said to have 65% sequence identity to the listed sequence. In one 
embodiment, a substantially equivalent, e.g., mutant, sequence of the invention varies from a 
listed sequence by no more than 30% (70% sequence identity); in a variation of this embodiment,
by no more than 25% (75% sequence identity); in a further variation of this embodiment, by
no more than 20% (80% sequence identity); in a further variation of this embodiment, by no
more than 10% (90% sequence identity); and in a further variation of this embodiment, by no
more than 5% (95% sequence identity). Substantially equivalent, e.g., mutant, amino acid
sequences according to the invention preferably have at least 80% sequence identity with a listed 
amino acid sequence, more preferably at least 90% sequence identity. Substantially equivalent 
nucleotide sequences of the invention can have lower percent sequence identities, taking into 
account, for example, the redundancy or degeneracy of the genetic code. Preferably, the nucleotide
sequence has at least about 65% identity, more preferably at least about 75% identity, and most
preferably at least about 95% identity. For the purposes of the present invention, sequences 
having substantially equivalent biological activity and substantially equivalent expression 
characteristics are considered substantially equivalent. For the purposes of determining 
equivalence, truncation of the mature sequence (e.g., via a mutation which creates a spurious 
stop codon) should be disregarded. Sequence identity may be determined, e.g., using the Jotun 
Hein method (Hein, J. (1990) Methods Enzymol. 183:626-645). Identity between sequences can 
also be determined by other methods known in the art, e.g. by varying hybridization conditions. 
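
As a worked illustration of the percentage calculation described above, the following minimal
Python sketch counts substitutions, additions, and deletions between two pre-aligned sequences
and divides by the number of residues in the subject sequence. The function names, the use of
"-" as a gap character, and the assumption that the two sequences have already been aligned
(for example, by the Jotun Hein method) are illustrative choices and are not part of the
disclosure.

```python
def variation_fraction(reference_aligned: str, subject_aligned: str) -> float:
    """Fraction of substitutions, additions, and deletions in the subject.

    Both inputs are assumed to be pre-aligned strings of equal length in
    which '-' marks a gap (an assumption made for this sketch).
    """
    if len(reference_aligned) != len(subject_aligned):
        raise ValueError("aligned sequences must have the same length")
    # Every column where the two sequences disagree is one substitution,
    # addition (gap in reference), or deletion (gap in subject).
    differences = sum(
        1 for ref, sub in zip(reference_aligned, subject_aligned) if ref != sub
    )
    # The denominator is the number of residues in the subject sequence.
    subject_residues = sum(1 for sub in subject_aligned if sub != "-")
    if subject_residues == 0:
        raise ValueError("subject sequence contains no residues")
    return differences / subject_residues


def percent_identity(reference_aligned: str, subject_aligned: str) -> float:
    """Percent sequence identity, taken here as 100 * (1 - variation)."""
    return 100.0 * (1.0 - variation_fraction(reference_aligned, subject_aligned))


def is_substantially_equivalent(
    reference_aligned: str, subject_aligned: str, max_variation: float = 0.35
) -> bool:
    """Apply the sequence-variation threshold only (default: about 35%)."""
    return variation_fraction(reference_aligned, subject_aligned) <= max_variation


if __name__ == "__main__":
    reference = "ACGTACGTAC"
    subject = "ACGTTCGTAC"  # one substitution in ten residues
    print(round(percent_identity(reference, subject), 1))
    print(is_substantially_equivalent(reference, subject))
```

In this sketch, one substitution in a ten-residue subject sequence gives a variation of 0.1,
which corresponds to 90% sequence identity. The sketch covers only the arithmetic; it does not
address the functional criteria (biological activity and expression characteristics) that also
bear on equivalence.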

The term "totipotent" refers to the capability of a cell to differentiate into all of the cell 
types of an adult organism. 

The term "transformation" means introducing DNA into a suitable host cell so that the 
DNA is replicable, either as an extrachromosomal element, or by chromosomal integration. The
term "transfection" refers to the taking up of an expression vector by a suitable host cell, whether
or not any coding sequences are in fact expressed. The term "infection" refers to the introduction 
of nucleic acids into a suitable host cell by use of a virus or viral vector. 

As used herein, an "uptake modulating fragment," UMF, means a series of nucleotides 
which mediate the uptake of a linked DNA fragment into a cell. UMFs can be readily identified 
using known UMFs as a target sequence or target motif with the computer-based systems 
described below. The presence and activity of a UMF can be confirmed by attaching the 
suspected UMF to a marker sequence. The resulting nucleic acid molecule is then incubated 
with an appropriate host under appropriate conditions and the uptake of the marker sequence is 
determined. As described above, a UMF will increase the frequency of uptake of a linked 
marker sequence. 

Each of the above terms is meant to encompass all that is described for each, unless the 
context dictates otherwise. 

4.2 NUCLEIC ACIDS OF THE INVENTION 

Nucleotide sequences of the invention are set forth in the Sequence Listing. 

The isolated polynucleotides of the invention include a polynucleotide comprising the 
nucleotide sequences of SEQ ID NO:1-1786 and 3573-5358; a polynucleotide encoding any one
of the peptide sequences of SEQ ID NO:1787-3572 and 5359-7144; and a polynucleotide
comprising the nucleotide sequence encoding the mature protein coding sequence of the
polypeptides of any one of SEQ ID NO:1787-3572 and 5359-7144. The polynucleotides of the
present invention also include, but are not limited to, a polynucleotide that hybridizes under
stringent conditions to (a) the complement of any of the nucleotide sequences of SEQ ID
NO:1-1786 and 3573-5358; (b) nucleotide sequences encoding any one of the amino acid sequences
set forth in the Sequence Listing; (c) a polynucleotide which is an allelic variant of any
polynucleotide recited above; (d) a polynucleotide which encodes a species homolog of any of
the proteins recited above; or (e) a polynucleotide that encodes a polypeptide comprising a
specific domain or truncation of the polypeptides of SEQ ID NO:1787-3572 and 5359-7144.
Domains of interest may depend on the nature of the encoded polypeptide; e.g., domains in 
receptor-like polypeptides include ligand-binding, extracellular, transmembrane, or cytoplasmic 
domains, or combinations thereof; domains in immunoglobulin-like proteins include the variable 
immunoglobulin-like domains; domains in enzyme-like polypeptides include catalytic and
substrate binding domains; and domains in ligand polypeptides include receptor-binding
domains.

The polynucleotides of the invention include naturally occurring or wholly or partially
synthetic DNA, e.g., cDNA and genomic DNA, and RNA, e.g., mRNA. The polynucleotides 
may include all of the coding region of the cDNA or may represent a portion of the coding 
region of the cDNA. 

The present invention also provides genes corresponding to the cDNA sequences disclosed
herein. The corresponding genes can be isolated in accordance with known methods using the
sequence information disclosed herein. Such methods include the preparation of probes or primers
from the disclosed sequence information for identification and/or amplification of genes in
appropriate genomic libraries or other sources of genomic materials. Further 5' and 3' sequence can
be obtained using methods known in the art. For example, full length cDNA or genomic DNA that
corresponds to any of the polynucleotides of SEQ ID NO:1-1786 and 3573-5358 can be obtained
by screening appropriate cDNA or genomic DNA libraries under suitable hybridization conditions
using any of the polynucleotides of SEQ ID NO:1-1786 and 3573-5358 or a portion thereof as a
probe. Alternatively, the polynucleotides of SEQ ID NO:1-1786 and 3573-5358 may be used as the
basis for suitable primer(s) that allow identification and/or amplification of genes in appropriate
genomic DNA or cDNA libraries.

The nucleic acid sequences of the invention can be assembled from ESTs and sequences
(including cDNA and genomic sequences) obtained from one or more public databases, such as
dbEST, gbpri, and UniGene. The EST sequences can provide identifying sequence information,
representative fragment or segment information, or novel segment information for the full-length
gene.

The polynucleotides of the invention also provide polynucleotides including nucleotide
sequences that are substantially equivalent to the polynucleotides recited above. Polynucleotides
according to the invention can have, e.g., at least about 65%, at least about 70%, at least about
75%, at least about 80%, more typically at least about 90%, and even more typically at least
about 95%, sequence identity to a polynucleotide recited above.

Included within the scope of the nucleic acid sequences of the invention are nucleic acid
sequence fragments that hybridize under stringent conditions to any of the nucleotide sequences
of SEQ ID NO:1-1786 and 3573-5358, or complements thereof, which fragment is greater than
about 5 nucleotides, preferably 7 nucleotides, more preferably greater than 9 nucleotides and
most preferably greater than 17 nucleotides. Fragments of, e.g., 15, 17, or 20 nucleotides or more
that are selective for (i.e., specifically hybridize to) any one of the polynucleotides of the
invention are contemplated. Probes capable of specifically hybridizing to a polynucleotide can
differentiate polynucleotide sequences of the invention from other polynucleotide sequences in
the same family of genes or can differentiate human genes from genes of other species, and are
preferably based on unique nucleotide sequences.

The sequences falling within the scope of the present invention are not limited to these
specific sequences, but also include allelic and species variations thereof. Allelic and species
variations can be routinely determined by comparing the sequence provided in SEQ ID NO:1-1786
and 3573-5358, a representative fragment thereof, or a nucleotide sequence at least 90% identical,
preferably 95% identical, to SEQ ID NO:1-1786 and 3573-5358 with a sequence from another
isolate of the same species. Furthermore, to accommodate codon variability, the invention includes
nucleic acid molecules coding for the same amino acid sequences as do the specific ORFs disclosed
herein. In other words, in the coding region of an ORF, substitution of one codon for another codon
that encodes the same amino acid is expressly contemplated.
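
To illustrate the codon substitution just described, the following minimal Python sketch
translates two different coding sequences with the standard genetic code and checks whether
they encode the same amino acid sequence. The function names and the example sequences are
hypothetical and are not taken from the Sequence Listing; the sketch assumes an in-frame DNA
coding sequence written with the letters A, C, G, and T.

```python
from itertools import product

# Standard genetic code, built in TCAG order; '*' marks a stop codon.
_BASES = "TCAG"
_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def translate(coding_dna: str) -> str:
    """Translate an in-frame DNA coding sequence into one-letter amino acids."""
    coding_dna = coding_dna.upper()
    if len(coding_dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of three")
    return "".join(
        CODON_TABLE[coding_dna[i:i + 3]] for i in range(0, len(coding_dna), 3)
    )


def encodes_same_protein(orf_a: str, orf_b: str) -> bool:
    """True if two ORFs differ only by synonymous codon substitutions (or not at all)."""
    return translate(orf_a) == translate(orf_b)


if __name__ == "__main__":
    original = "ATGGCTCTGTAA"   # hypothetical ORF: Met-Ala-Leu-stop
    recoded = "ATGGCCTTATAA"    # GCT->GCC and CTG->TTA, both synonymous
    print(translate(original), translate(recoded))
    print(encodes_same_protein(original, recoded))
```

Here the two hypothetical sequences differ at two codons yet both translate to Met-Ala-Leu
followed by a stop codon, which is the situation the paragraph above contemplates.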

The nearest neighbor or homology result for the nucleic acids of the present invention,
including SEQ ID NO:1-1786 and 3573-5358, can be obtained by searching a database using an
algorithm or a program. Preferably, BLAST, which stands for Basic Local Alignment Search Tool,
is used to search for local sequence alignments (Altschul, S.F., J. Mol. Evol. 36:290-300 (1993) and
Altschul, S.F. et al., J. Mol. Biol. 215:403-410 (1990)). Alternatively, a FASTA version 3 search
against Genpept, using the Fastxy algorithm, can be used.

Species homologs (or orthologs) of the disclosed polynucleotides and proteins are also
provided by the present invention. Species homologs may be isolated and identified by making
suitable probes or primers from the sequences provided herein and screening a suitable nucleic
acid source from the desired species.

The invention also encompasses allelic variants of the disclosed polynucleotides or
proteins; that is, naturally-occurring alternative forms of the isolated polynucleotide which also
encode proteins which are identical, homologous or related to that encoded by the
polynucleotides.

The nucleic acid sequences of the invention are further directed to sequences which
encode variants of the described nucleic acids. These amino acid sequence variants may be
prepared by methods known in the art by introducing appropriate nucleotide changes into a
native or variant polynucleotide. There are two variables in the construction of amino acid
sequence variants: the location of the mutation and the nature of the mutation. Nucleic acids
encoding the amino acid sequence variants are preferably constructed by mutating the
polynucleotide to encode an amino acid sequence that does not occur in nature. These nucleic
acid alterations can be made at sites that differ in the nucleic acids from different species
(variable positions) or in highly conserved regions (constant regions). Sites at such locations
will typically be modified in series, e.g., by substituting first with conservative choices (e.g.,
hydrophobic amino acid to a different hydrophobic amino acid) and then with more distant
choices (e.g., hydrophobic amino acid to a charged amino acid), and then deletions or insertions
may be made at the target site. Amino acid sequence deletions generally range from about 1 to
30 residues, preferably about 1 to 10 residues, and are typically contiguous. Amino acid
insertions include amino- and/or carboxyl-terminal fusions ranging in length from one to one
hundred or more residues, as well as intrasequence insertions of single or multiple amino acid
residues. Intrasequence insertions may range generally from about 1 to 10 amino acid residues,
preferably from 1 to 5 residues. Examples of terminal insertions include the heterologous signal
sequences necessary for secretion or for intracellular targeting in different host cells and
sequences such as FLAG or poly-histidine sequences useful for purifying the expressed protein.

In a preferred method, polynucleotides encoding the novel amino acid sequences are
changed via site-directed mutagenesis. This method uses oligonucleotide sequences to alter a
polynucleotide to encode the desired amino acid variant, as well as sufficient adjacent
nucleotides on both sides of the changed amino acid to form a stable duplex on either side of the
site being changed. In general, the techniques of site-directed mutagenesis are well known to
those of skill in the art and this technique is exemplified by publications such as Edelman et al.,
DNA 2:183 (1983). A versatile and efficient method for producing site-specific changes in a
polynucleotide sequence was published by Zoller and Smith, Nucleic Acids Res. 10:6487-6500
(1982). PCR may also be used to create amino acid sequence variants of the novel nucleic acids.
When small amounts of template DNA are used as starting material, primer(s) that differ
slightly in sequence from the corresponding region in the template DNA can generate the desired
amino acid variant. PCR amplification results in a population of product DNA fragments that
differ from the polynucleotide template encoding the polypeptide at the position specified by the
primer. The product DNA fragments replace the corresponding region in the plasmid and this
gives a polynucleotide encoding the desired amino acid variant.
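
The structure of a mutagenic oligonucleotide as described above (an altered codon with adjacent
template nucleotides on both sides) can be sketched in Python as follows. This is only an
illustration under stated assumptions: the function name, the default flank length, and the
example template are hypothetical, and the sketch does not model duplex stability, melting
temperature, or strand orientation, all of which would matter in practice.

```python
def mutagenic_oligo(template: str, codon_index: int, new_codon: str, flank: int = 12) -> str:
    """Return an oligonucleotide carrying new_codon flanked by template sequence.

    template     in-frame coding-strand DNA (hypothetical input for this sketch)
    codon_index  zero-based index of the codon to be changed
    new_codon    three-nucleotide replacement codon
    flank        number of unchanged template nucleotides kept on each side
    """
    template = template.upper()
    new_codon = new_codon.upper()
    if len(new_codon) != 3:
        raise ValueError("new_codon must be exactly three nucleotides")
    start = codon_index * 3
    end = start + 3
    # Require a full flank on both sides so the oligo can pair on either
    # side of the changed site.
    if codon_index < 0 or start - flank < 0 or end + flank > len(template):
        raise ValueError("not enough template sequence for the requested flanks")
    return template[start - flank:start] + new_codon + template[end:end + flank]


if __name__ == "__main__":
    # Hypothetical template; codon 3 (TGT, cysteine) is changed to TCT (serine).
    template = "ATGGCTAGCTGTGGTCATCTGAAAGAGCTGACCGAATTC"
    print(mutagenic_oligo(template, codon_index=3, new_codon="TCT", flank=9))
```

The example replaces a cysteine codon with a serine codon, echoing the cysteine substitution
mentioned earlier in this description; the flank length of nine nucleotides is chosen only to
keep the example short.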

A further technique for generating amino acid variants is the cassette mutagenesis
technique described in Wells et al., Gene 34:315 (1985); and other mutagenesis techniques well
known in the art, such as, for example, the techniques in Sambrook et al., supra, and Current
Protocols in Molecular Biology, Ausubel et al. Due to the inherent degeneracy of the genetic
code, other DNA sequences which encode substantially the same or a functionally equivalent
amino acid sequence may be used in the practice of the invention for the cloning and expression
of these novel nucleic acids. Such DNA sequences include those which are capable of
hybridizing to the appropriate novel nucleic acid sequence under stringent conditions.

Polynucleotides encoding preferred polypeptide truncations of the invention can be used
to generate polynucleotides encoding chimeric or fusion proteins comprising one or more
domains of the invention and heterologous protein sequences.

The polynucleotides of the invention additionally include the complement of any of the 
polynucleotides recited above. The polynucleotide can be DNA (genomic, cDNA, amplified, or 
synthetic) or RNA. Methods and algorithms for obtaining such polynucleotides are well known 
to those of skill in the art and can include, for example, methods for determining hybridization 
conditions that can routinely isolate polynucleotides of the desired sequence identities. 

In accordance with the invention, polynucleotide sequences comprising the mature 
protein coding sequences corresponding to any one of SEQ ID NO:1-1786 and 3573-5358, or
functional equivalents thereof, may be used to generate recombinant DNA molecules that direct 
the expression of that nucleic acid, or a functional equivalent thereof, in appropriate host cells. 
Also included are the cDNA inserts of any of the clones identified herein. 

A polynucleotide according to the invention can be joined to any of a variety of other 
nucleotide sequences by well-established recombinant DNA techniques (see Sambrook J et al. 
(1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, NY). Useful 
nucleotide sequences for joining to polynucleotides include an assortment of vectors, e.g., 
plasmids, cosmids, lambda phage derivatives, phagemids, and the like, that are well known in the 
art. Accordingly, the invention also provides a vector including a polynucleotide of the 
invention and a host cell containing the polynucleotide. In general, the vector contains an origin 
of replication functional in at least one organism, convenient restriction endonuclease sites, and a 
selectable marker for the host cell. Vectors according to the invention include expression 
vectors, replication vectors, probe generation vectors, and sequencing vectors. A host cell 
according to the invention can be a prokaryotic or eukaryotic cell and can be a unicellular
organism or part of a multicellular organism. 

The present invention further provides recombinant constructs comprising a nucleic acid 
having any of the nucleotide sequences of SEQ ID NO:1-1786 and 3573-5358 or a fragment
thereof or any other polynucleotides of the invention. In one embodiment, the recombinant
constructs of the present invention comprise a vector, such as a plasmid or viral vector, into
which a nucleic acid having any of the nucleotide sequences of SEQ ID NO:1-1786 and
3573-5358 or a fragment thereof is inserted, in a forward or reverse orientation. In the case of a vector
comprising one of the ORFs of the present invention, the vector may further comprise regulatory 
sequences, including for example, a promoter, operably linked to the ORF. Large numbers of 
suitable vectors and promoters are known to those of skill in the art and are commercially 
available for generating the recombinant constructs of the present invention. The following
vectors are provided by way of example. Bacterial: pBs, phagescript, PsiX174, pBluescript SK,
pBs KS, pNH8a, pNH16a, pNH18a, pNH46a (Stratagene); pTrc99A, pKK223-3, pKK233-3,
pDR540, pRIT5 (Pharmacia). Eukaryotic: pWLneo, pSV2cat, pOG44, PXTI, pSG (Stratagene);
pSVK3, pBPV, pMSG, pSVL (Pharmacia).

The isolated polynucleotide of the invention may be operably linked to an expression
control sequence such as the pMT2 or pED expression vectors disclosed in Kaufman et al.,
Nucleic Acids Res. 19, 4485-4490 (1991), in order to produce the protein recombinantly. Many
suitable expression control sequences are known in the art. General methods of expressing
recombinant proteins are also known and are exemplified in R. Kaufman, Methods in
Enzymology 185, 537-566 (1990). As defined herein "operably linked" means that the isolated
polynucleotide of the invention and an expression control sequence are situated within a vector
or cell in such a way that the protein is expressed by a host cell which has been transformed
(transfected) with the ligated polynucleotide/expression control sequence.

Promoter regions can be selected from any desired gene using CAT (chloramphenicol
transferase) vectors or other vectors with selectable markers. Two appropriate vectors are
pKK232-8 and pCM7. Particular named bacterial promoters include lacI, lacZ, T3, T7, gpt,
lambda PR, and trc. Eukaryotic promoters include CMV immediate early, HSV thymidine
kinase, early and late SV40, LTRs from retrovirus, and mouse metallothionein-I. Selection of
the appropriate vector and promoter is well within the level of ordinary skill in the art.

Generally, recombinant expression vectors will include origins of replication and selectable
markers permitting transformation of the host cell, e.g., the ampicillin resistance gene of E. coli
and S. cerevisiae TRP1 gene, and a promoter derived from a highly-expressed gene to direct
transcription of a downstream structural sequence. Such promoters can be derived from operons
encoding glycolytic enzymes such as 3-phosphoglycerate kinase (PGK), α-factor, acid
phosphatase, or heat shock proteins, among others. The heterologous structural sequence is
assembled in appropriate phase with translation initiation and termination sequences, and
preferably, a leader sequence capable of directing secretion of translated protein into the
periplasmic space or extracellular medium. Optionally, the heterologous sequence can encode a
fusion protein including an amino terminal identification peptide imparting desired
characteristics, e.g., stabilization or simplified purification of expressed recombinant product.
Useful expression vectors for bacterial use are constructed by inserting a structural DNA
sequence encoding a desired protein together with suitable translation initiation and termination
signals in operable reading phase with a functional promoter. The vector will comprise one or
more phenotypic selectable markers and an origin of replication to ensure maintenance of the
vector and to, if desirable, provide amplification within the host. Suitable prokaryotic hosts for
transformation include E. coli, Bacillus subtilis, Salmonella typhimurium and various species
within the genera Pseudomonas, Streptomyces, and Staphylococcus, although others may also be
employed as a matter of choice.

As a representative but non-limiting example, useful expression vectors for bacterial use
can comprise a selectable marker and bacterial origin of replication derived from commercially
available plasmids comprising genetic elements of the well known cloning vector pBR322
(ATCC 37017). Such commercial vectors include, for example, pKK223-3 (Pharmacia Fine
Chemicals, Uppsala, Sweden) and GEM 1 (Promega Biotech, Madison, WI, USA). These
pBR322 "backbone" sections are combined with an appropriate promoter and the structural
sequence to be expressed. Following transformation of a suitable host strain and growth of the
host strain to an appropriate cell density, the selected promoter is induced or derepressed by
appropriate means (e.g., temperature shift or chemical induction) and cells are cultured for an
additional period. Cells are typically harvested by centrifugation, disrupted by physical or
chemical means, and the resulting crude extract retained for further purification.

Polynucleotides of the invention can also be used to induce immune responses. For
example, as described in Fan et al., Nat. Biotech. 17:870-872 (1999), incorporated herein by
reference, nucleic acid sequences encoding a polypeptide may be used to generate antibodies
against the encoded polypeptide following topical administration of naked plasmid DNA or
following injection, preferably intramuscular injection, of the DNA. The nucleic acid
sequences are preferably inserted in a recombinant expression vector and may be in the form of
naked DNA.

4.3 ANTISENSE 

Another aspect of the invention pertains to isolated antisense nucleic acid molecules that
are hybridizable to or complementary to the nucleic acid molecule comprising the nucleotide
sequence of SEQ ID NO:1-1786 and 3573-5358, or fragments, analogs or derivatives thereof.
An "antisense" nucleic acid comprises a nucleotide sequence that is complementary to a "sense"
nucleic acid encoding a protein, e.g., complementary to the coding strand of a double-stranded
cDNA molecule or complementary to an mRNA sequence. In specific aspects, antisense nucleic
acid molecules are provided that comprise a sequence complementary to at least about 10, 25,
50, 100, 250 or 500 nucleotides or an entire coding strand, or to only a portion thereof. Nucleic
acid molecules encoding fragments, homologs, derivatives and analogs of a protein of any of
SEQ ID NO:1787-3572 and 5359-7144 or antisense nucleic acids complementary to a nucleic
acid sequence of SEQ ID NO:1-1786 and 3573-5358 are additionally provided.

In one embodiment, an antisense nucleic acid molecule is antisense to a "coding region"
of the coding strand of a nucleotide sequence of the invention. The term "coding region" refers
to the region of the nucleotide sequence comprising codons which are translated into amino acid
residues. In another embodiment, the antisense nucleic acid molecule is antisense to a
"noncoding region" of the coding strand of a nucleotide sequence of the invention. The term
"noncoding region" refers to 5' and 3' sequences which flank the coding region that are not
translated into amino acids (i.e., also referred to as 5' and 3' untranslated regions).

Given the coding strand sequences encoding a nucleic acid disclosed herein (e.g., SEQ ID
NO:1-1786 and 3573-5358), antisense nucleic acids of the invention can be designed according
to the rules of Watson and Crick or Hoogsteen base pairing. The antisense nucleic acid molecule
can be complementary to the entire coding region of an mRNA, but more preferably is an
oligonucleotide that is antisense to only a portion of the coding or noncoding region of an mRNA.
For example, the antisense oligonucleotide can be complementary to the region surrounding the
translation start site of an mRNA. An antisense oligonucleotide can be, for example, about 5, 10,
15, 20, 25, 30, 35, 40, 45 or 50 nucleotides in length. An antisense nucleic acid of the invention 
can be constructed using chemical synthesis or enzymatic ligation reactions using procedures 
known in the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can 
be chemically synthesized using naturally occurring nucleotides or variously modified 
nucleotides designed to increase the biological stability of the molecules or to increase the 
physical stability of the duplex formed between the antisense and sense nucleic acids, e.g., 
phosphorothioate derivatives and acridine substituted nucleotides can be used. 
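
As an illustration of designing an antisense oligonucleotide by Watson-Crick base pairing, the
following minimal Python sketch takes a window of a coding-strand DNA sequence around the
translation start codon and returns its reverse complement. The function names, the window
parameters, and the example sequence are hypothetical; the sketch handles only unmodified
A, C, G, and T and does not address Hoogsteen pairing or modified nucleotides.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Watson-Crick reverse complement of a DNA sequence (5'->3' in, 5'->3' out)."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def antisense_around_start(coding_strand: str, upstream: int = 3, length: int = 15) -> str:
    """Antisense DNA oligo complementary to a window around the first ATG.

    coding_strand  sense-strand DNA containing a start codon (hypothetical input)
    upstream       nucleotides to include 5' of the ATG
    length         total oligonucleotide length
    """
    coding_strand = coding_strand.upper()
    atg = coding_strand.find("ATG")
    if atg < 0:
        raise ValueError("no ATG start codon found")
    start = atg - upstream
    if start < 0 or start + length > len(coding_strand):
        raise ValueError("window extends beyond the supplied sequence")
    return reverse_complement(coding_strand[start:start + length])


if __name__ == "__main__":
    # Hypothetical sense-strand fragment; the first ATG is at position 6.
    sense = "GGCACCATGGCTAGCTGTGGT"
    print(antisense_around_start(sense, upstream=3, length=15))
```

The 15-nucleotide length used here falls within the range of oligonucleotide lengths listed
above; the choice of three upstream nucleotides is arbitrary and made only for the example.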

Examples of modified nucleotides that can be used to generate the antisense nucleic acid
include: 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine,
4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil,
5-carboxymethylaminomethyl-2-thiouridine, 5-carboxymethylaminomethyluracil, dihydrouracil,
beta-D-galactosylqueosine, inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine,
2,2-dimethylguanine, 2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine,
N6-adenine, 7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil,
beta-D-mannosylqueosine, 5'-methoxycarboxymethyluracil, 5-methoxyuracil,
2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil,
queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil,
uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil,
3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and 2,6-diaminopurine. Alternatively, the
antisense nucleic acid can be produced biologically using an expression vector into which a
nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the
inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest,
described further in the following subsection).

The antisense nucleic acid molecules of the invention are typically administered to a
subject or generated in situ such that they hybridize with or bind to cellular mRNA and/or
genomic DNA encoding a protein according to the invention to thereby inhibit expression of the
protein, e.g., by inhibiting transcription and/or translation. The hybridization can be by
conventional nucleotide complementarity to form a stable duplex, or, for example, in the case of
an antisense nucleic acid molecule that binds to DNA duplexes, through specific interactions in
the major groove of the double helix. An example of a route of administration of antisense
nucleic acid molecules of the invention includes direct injection at a tissue site. Alternatively,
antisense nucleic acid molecules can be modified to target selected cells and then administered
systemically. For example, for systemic administration, antisense molecules can be modified
such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g.,
by linking the antisense nucleic acid molecules to peptides or antibodies that bind to cell surface
receptors or antigens. The antisense nucleic acid molecules can also be delivered to cells using
the vectors described herein. To achieve sufficient intracellular concentrations of antisense
molecules, vector constructs in which the antisense nucleic acid molecule is placed under the
control of a strong pol II or pol III promoter are preferred.

In yet another embodiment, the antisense nucleic acid molecule of the invention is an
α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific
double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the
strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res 15: 6625-6641). The
antisense nucleic acid molecule can also comprise a 2'-o-methylribonucleotide (Inoue et al.
(1987) Nucleic Acids Res 15: 6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987)
FEBS Lett 215: 327-330).



4.4 RIBOZYMES AND PNA MOIETIES 

In still another embodiment, an antisense nucleic acid of the invention is a ribozyme.
Ribozymes are catalytic RNA molecules with ribonuclease activity that are capable of cleaving a
single-stranded nucleic acid, such as a mRNA, to which they have a complementary region.
Thus, ribozymes (e.g., hammerhead ribozymes (described in Haselhoff and Gerlach (1988)
Nature 334:585-591)) can be used to catalytically cleave mRNA transcripts to thereby inhibit
translation of a mRNA. A ribozyme having specificity for a nucleic acid of the invention can be
designed based upon the nucleotide sequence of a DNA disclosed herein (i.e., SEQ ID NO:1-1786
and 3573-5358). For example, a derivative of a Tetrahymena L-19 IVS RNA can be
constructed in which the nucleotide sequence of the active site is complementary to the
nucleotide sequence to be cleaved in a SECX-encoding mRNA. See, e.g., Cech et al. U.S. Pat.
No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742. Alternatively, SECX mRNA can be
used to select a catalytic RNA having a specific ribonuclease activity from a pool of RNA
molecules. See, e.g., Bartel et al., (1993) Science 261:1411-1418.

Alternatively, gene expression can be inhibited by targeting nucleotide sequences 
complementary to the regulatory region (e.g., promoter and/or enhancers) to form triple helical 
structures that prevent transcription of the gene in target cells. See generally, Helene (1991)
Anticancer Drug Des. 6: 569-84; Helene et al. (1992) Ann. N.Y. Acad. Sci. 660:27-36; and
Maher (1992) Bioassays 14: 807-15.

In various embodiments, the nucleic acids of the invention can be modified at the base 
moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or 
solubility of the molecule. For example, the deoxyribose phosphate backbone of the nucleic 
acids can be modified to generate peptide nucleic acids (see Hyrup et al. (1996) Bioorg Med
Chem 4: 5-23). As used herein, the terms "peptide nucleic acids" or "PNAs" refer to nucleic acid
mimics, e.g., DNA mimics, in which the deoxyribose phosphate backbone is replaced by a
pseudopeptide backbone and only the four natural nucleobases are retained. The neutral
backbone of PNAs has been shown to allow for specific hybridization to DNA and RNA under
conditions of low ionic strength. The synthesis of PNA oligomers can be performed using
standard solid phase peptide synthesis protocols as described in Hyrup et al. (1996) above;
Perry-O'Keefe et al. (1996) PNAS 93: 14670-675.

PNAs of the invention can be used in therapeutic and diagnostic applications. For 
example, PNAs can be used as antisense or antigene agents for sequence-specific modulation of 
gene expression by, e.g., inducing transcription or translation arrest or inhibiting replication. 
PNAs of the invention can also be used, e.g., in the analysis of single base pair mutations in a 
gene by, e.g., PNA directed PCR clamping; as artificial restriction enzymes when used in 
combination with other enzymes, e.g., S1 nucleases (Hyrup B. (1996) above); or as probes or
primers for DNA sequence and hybridization (Hyrup et al. (1996), above; Perry-O'Keefe (1996),
above). 

In another embodiment, PNAs of the invention can be modified, e.g., to enhance their 
stability or cellular uptake, by attaching lipophilic or other helper groups to PNA, by the 
formation of PNA-DNA chimeras, or by the use of liposomes or other techniques of drug 
delivery known in the art. For example, PNA-DNA chimeras can be generated that may 
combine the advantageous properties of PNA and DNA. Such chimeras allow DNA recognition 
enzymes, e.g., RNase H and DNA polymerases, to interact with the DNA portion while the PNA 
portion would provide high binding affinity and specificity. PNA-DNA chimeras can be linked
using linkers of appropriate lengths selected in terms of base stacking, number of bonds between
the nucleobases, and orientation (Hyrup (1996) above). The synthesis of PNA-DNA chimeras
can be performed as described in Hyrup (1996) above and Finn et al. (1996) Nucl Acids Res 24:
3357-63. For example, a DNA chain can be synthesized on a solid support using standard
phosphoramidite coupling chemistry, and modified nucleoside analogs, e.g.,
5'-(4-methoxytrityl)amino-5'-deoxy-thymidine phosphoramidite, can be used between the PNA
and the 5' end of DNA (Mag et al. (1989) Nucl Acid Res 17: 5973-88). PNA monomers are then
coupled in a stepwise manner to produce a chimeric molecule with a 5' PNA segment and a 3'
DNA segment (Finn et al. (1996) above). Alternatively, chimeric molecules can be synthesized
with a 5' DNA segment and a 3' PNA segment. See, Petersen et al. (1975) Bioorg Med Chem
Lett 5: 1119-11124.

In other embodiments, the oligonucleotide may include other appended groups such as 
peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the 
cell membrane (see, e.g., Letsinger et al., 1989, Proc. Natl. Acad. Sci. U.S.A. 86:6553-6556;
Lemaitre et al., 1987, Proc. Natl. Acad. Sci. 84:648-652; PCT Publication No. WO88/09810) or
the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134). In addition,
oligonucleotides can be modified with hybridization triggered cleavage agents (See, e.g., Krol et
al., 1988, BioTechniques 6:958-976) or intercalating agents. (See, e.g., Zon, 1988, Pharm. Res.
5: 539-549). To this end, the oligonucleotide may be conjugated to another molecule, e.g., a 
peptide, a hybridization triggered cross-linking agent, a transport agent, a hybridization-triggered 
cleavage agent, etc. 



4.5 HOSTS 

The present invention further provides host cells genetically engineered to contain the 
polynucleotides of the invention. For example, such host cells may contain nucleic acids of the 
invention introduced into the host cell using known transformation, transfection or infection 
methods. The present invention still further provides host cells genetically engineered to express 
the polynucleotides of the invention, wherein such polynucleotides are in operative association 
with a regulatory sequence heterologous to the host cell which drives expression of the 
polynucleotides in the cell. 

Knowledge of nucleic acid sequences allows for modification of cells to permit, or 
increase, expression of endogenous polypeptide. Cells can be modified (e.g., by homologous 
recombination) to provide increased polypeptide expression by replacing, in whole or in part, the 
naturally occurring promoter with all or part of a heterologous promoter so that the cells express 
the polypeptide at higher levels. The heterologous promoter is inserted in such a manner that it
is operatively linked to the encoding sequences. See, for example, PCT International Publication
No. WO94/12650, PCT International Publication No. WO92/20808, and PCT International
Publication No. WO91/09955. It is also contemplated that, in addition to heterologous promoter
DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or
intron DNA may be inserted along with the heterologous promoter DNA. If linked to the coding
sequence, amplification of the marker DNA by standard selection methods results in
co-amplification of the desired protein coding sequences in the cells.

The host cell can be a higher eukaryotic host cell, such as a mammalian cell, a lower
eukaryotic host cell, such as a yeast cell, or the host cell can be a prokaryotic cell, such as a
bacterial cell. Introduction of the recombinant construct into the host cell can be effected by
calcium phosphate transfection, DEAE-dextran mediated transfection, or electroporation (Davis,
L. et al., Basic Methods in Molecular Biology (1986)). The host cells containing one of the
polynucleotides of the invention can be used in conventional manners to produce the gene
product encoded by the isolated fragment (in the case of an ORF) or can be used to produce a
heterologous protein under the control of the EMF.

Any host/vector system can be used to express one or more of the ORFs of the present
invention. These include, but are not limited to, eukaryotic hosts such as HeLa cells, CV-1 cells,
COS cells, 293 cells, and Sf9 cells, as well as prokaryotic hosts such as E. coli and B. subtilis.
The most preferred cells are those which do not normally express the particular polypeptide or
protein or which express the polypeptide or protein at a low natural level. Mature proteins can
be expressed in mammalian cells, yeast, bacteria, or other cells under the control of appropriate
promoters. Cell-free translation systems can also be employed to produce such proteins using
RNAs derived from the DNA constructs of the present invention. Appropriate cloning and
expression vectors for use with prokaryotic and eukaryotic hosts are described by Sambrook, et
al., in Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor, New
York (1989), the disclosure of which is hereby incorporated by reference.

Various mammalian cell culture systems can also be employed to express recombinant 
protein. Examples of mammalian expression systems include the COS-7 lines of monkey kidney 
fibroblasts, described by Gluzman, Cell 23:175 (1981). Other cell lines capable of expressing a 
compatible vector are, for example, the C127, monkey COS cells, Chinese Hamster Ovary
(CHO) cells, human kidney 293 cells, human epidermal A431 cells, human Colo205 cells, 3T3
cells, CV-1 cells, other transformed primate cell lines, normal diploid cells, cell strains derived
from in vitro culture of primary tissue, primary explants, HeLa cells, mouse L cells, BHK,
HL-60, U937, HaK or Jurkat cells. Mammalian expression vectors will comprise an origin of
replication, a suitable promoter and also any necessary ribosome binding sites, polyadenylation 
site, splice donor and acceptor sites, transcriptional termination sequences, and 5' flanking 
nontranscribed sequences. DNA sequences derived from the SV40 viral genome, for example, 
SV40 origin, early promoter, enhancer, splice, and polyadenylation sites may be used to provide 
the required nontranscribed genetic elements. Recombinant polypeptides and proteins produced 
in bacterial culture are usually isolated by initial extraction from cell pellets, followed by one or 
more salting-out, aqueous ion exchange or size exclusion chromatography steps. Protein 
refolding steps can be used, as necessary, in completing configuration of the mature protein. 
Finally, high performance liquid chromatography (HPLC) can be employed for final purification 
steps. Microbial cells employed in expression of proteins can be disrupted by any convenient 
method, including freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing 
agents. 

Alternatively, it may be possible to produce the protein in lower eukaryotes such as yeast 
or insects or in prokaryotes such as bacteria. Potentially suitable yeast strains include 
Saccharomyces cerevisiae, Schizosaccharomyces pombe, Kluyveromyces strains, Candida, or
any yeast strain capable of expressing heterologous proteins. Potentially suitable bacterial 
strains include Escherichia coli, Bacillus subtilis, Salmonella typhimurium, or any bacterial 
strain capable of expressing heterologous proteins. If the protein is made in yeast or bacteria, it 
may be necessary to modify the protein produced therein, for example by phosphorylation or 
glycosylation of the appropriate sites, in order to obtain the functional protein. Such covalent 
attachments may be accomplished using known chemical or enzymatic methods. 

In another embodiment of the present invention, cells and tissues may be engineered to 
express an endogenous gene comprising the polynucleotides of the invention under the control of 
inducible regulatory elements, in which case the regulatory sequences of the endogenous gene 
may be replaced by homologous recombination. As described herein, gene targeting can be used 
to replace a gene's existing regulatory region with a regulatory sequence isolated from a different 
gene or a novel regulatory sequence synthesized by genetic engineering methods. Such 
regulatory sequences may be comprised of promoters, enhancers, scaffold-attachment regions, 
negative regulatory elements, transcriptional initiation sites, regulatory protein binding sites or 
combinations of said sequences. Alternatively, sequences which affect the structure or stability 
of the RNA or protein produced may be replaced, removed, added, or otherwise modified by 
targeting. These sequences include polyadenylation signals, mRNA stability elements, splice
sites, leader sequences for enhancing or modifying transport or secretion properties of the
protein, or other sequences which alter or improve the function or stability of protein or RNA
molecules. 

The targeting event may be a simple insertion of the regulatory sequence, placing the 
gene under the control of the new regulatory sequence, e.g., inserting a new promoter or 
enhancer or both upstream of a gene. Alternatively, the targeting event may be a simple deletion 
of a regulatory element, such as the deletion of a tissue-specific negative regulatory element. 
Alternatively, the targeting event may replace an existing element; for example, a tissue-specific 
enhancer can be replaced by an enhancer that has broader or different cell-type specificity than 
the naturally occurring elements. Here, the naturally occurring sequences are deleted and new 
sequences are added. In all cases, the identification of the targeting event may be facilitated by 
the use of one or more selectable marker genes that are contiguous with the targeting DNA, 
allowing for the selection of cells in which the exogenous DNA has integrated into the host cell 
genome. The identification of the targeting event may also be facilitated by the use of one or 
more marker genes exhibiting the property of negative selection, such that the negatively 
selectable marker is linked to the exogenous DNA, but configured such that the negatively 
selectable marker flanks the targeting sequence, and such that a correct homologous 
recombination event with sequences in the host cell genome does not result in the stable 
integration of the negatively selectable marker. Markers useful for this purpose include the 
Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial xanthine-guanine 
phosphoribosyltransferase (gpt) gene.

The gene targeting or gene activation techniques which can be used in accordance with 
this aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to 
Chappel; U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No.
PCT/US92/09627 (WO93/09222) by Selden et al.; and International Application No.
PCT/US90/06436 (WO91/06667) by Skoultchi et al., each of which is incorporated by reference
herein in its entirety. 

4.6 POLYPEPTIDES OF THE INVENTION 

The isolated polypeptides of the invention include, but are not limited to, a polypeptide
comprising: the amino acid sequences set forth as any one of SEQ ID NO: 1787-3572 and 5359-
7144 or an amino acid sequence encoded by any one of the nucleotide sequences SEQ ID NO:1-1786
and 3573-5358 or the corresponding full length or mature protein. Polypeptides of the
invention also include polypeptides preferably with biological or immunological activity that are
encoded by: (a) a polynucleotide having any one of the nucleotide sequences set forth in SEQ ID
NO:1-1786 and 3573-5358 or (b) polynucleotides encoding any one of the amino acid sequences
set forth as SEQ ID NO: 1787-3572 and 5359-7144 or (c) polynucleotides that hybridize to the
complement of the polynucleotides of either (a) or (b) under stringent hybridization conditions.
The invention also provides biologically active or immunologically active variants of any of the
amino acid sequences set forth as SEQ ID NO: 1787-3572 and 5359-7144 or the corresponding
full length or mature protein; and "substantial equivalents" thereof (e.g., with at least about
65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least
about 90%, typically at least about 95%, more typically at least about 98%, or most typically at
least about 99% amino acid identity) that retain biological activity. Polypeptides encoded by
allelic variants may have a similar, increased, or decreased activity compared to polypeptides
comprising SEQ ID NO:1787-3572 and 5359-7144.
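
As a non-limiting illustration of the identity thresholds recited above, percent amino acid identity between two already-aligned sequences might be computed as sketched below. The function names and the example alignment are hypothetical, and the convention used here (gapped columns are excluded from the denominator) is an assumption of this sketch; alignment programs differ in how they define the denominator.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Both inputs must come from the same pairwise alignment, with '-' marking
    gaps. Assumption: gapped columns are ignored rather than counted as
    mismatches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 65.0) -> bool:
    """True if the aligned pair reaches the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Made-up alignment: five identical residues out of six compared columns.
identity = percent_identity("MKT-LLV", "MKTALIV")
```

Identity alone does not establish that a variant retains biological activity; that requirement is assessed separately.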

Fragments of the proteins of the present invention which are capable of exhibiting 
biological activity are also encompassed by the present invention. Fragments of the protein may 
be in linear form or they may be cyclized using known methods, for example, as described in H.
U. Saragovi, et al., Bio/Technology 10, 773-778 (1992) and in R. S. McDowell, et al., J. Amer.
Chem. Soc. 114, 9245-9253 (1992), both of which are incorporated herein by reference. Such
fragments may be fused to carrier molecules such as immunoglobulins for many purposes, 
including increasing the valency of protein binding sites. 

The present invention also provides both full-length and mature forms (for example,
without a signal sequence or precursor sequence) of the disclosed proteins. The protein coding
sequence is identified in the sequence listing by translation of the disclosed nucleotide
sequences. The mature form of such protein may be obtained by expression of a full-length
polynucleotide in a suitable mammalian cell or other host cell. The sequence of the mature form
of the protein is also determinable from the amino acid sequence of the full-length form. Where
proteins of the present invention are membrane bound, soluble forms of the proteins are also
provided. In such forms, part or all of the regions causing the proteins to be membrane bound
are deleted so that the proteins are fully secreted from the cell in which they are expressed.

Protein compositions of the present invention may further comprise an acceptable carrier, 
such as a hydrophilic, e.g., pharmaceutically acceptable, carrier. 

The present invention further provides isolated polypeptides encoded by the nucleic acid
fragments of the present invention or by degenerate variants of the nucleic acid fragments of the
present invention. By "degenerate variant" is intended nucleotide fragments which differ from a 
nucleic acid fragment of the present invention (e.g., an ORF) by nucleotide sequence but, due to 
the degeneracy of the genetic code, encode an identical polypeptide sequence. Preferred nucleic 
acid fragments of the present invention are the ORFs that encode proteins. 
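
The definition of a "degenerate variant" given above can be illustrated, purely by way of example, with the following Python sketch. It assumes the standard genetic code, in-frame DNA input whose length is a multiple of three, and hypothetical function names and sequences.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA sequence; '*' denotes a stop codon."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def is_degenerate_variant(orf_a: str, orf_b: str) -> bool:
    """True if the two ORFs differ in nucleotide sequence but encode the
    identical polypeptide sequence."""
    return orf_a.upper() != orf_b.upper() and translate(orf_a) == translate(orf_b)


# Made-up ORF fragments: both encode Met-Ala-Ser-Lys.
same_protein = is_degenerate_variant("ATGGCTTCGAAG", "ATGGCAAGCAAA")
```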

A variety of methodologies known in the art can be utilized to obtain any one of the
isolated polypeptides or proteins of the present invention. At the simplest level, the amino acid
sequence can be synthesized using commercially available peptide synthesizers. The
synthetically-constructed protein sequences, by virtue of sharing primary, secondary or tertiary
structural and/or conformational characteristics with proteins may possess biological properties
in common therewith, including protein activity. This technique is particularly useful in
producing small peptides and fragments of larger polypeptides. Fragments are useful, for
example, in generating antibodies against the native polypeptide. Thus, they may be employed
as biologically active or immunological substitutes for natural, purified proteins in screening of
therapeutic compounds and in immunological processes for the development of antibodies.

The polypeptides and proteins of the present invention can alternatively be purified from 
cells which have been altered to express the desired polypeptide or protein. As used herein, a 
cell is said to be altered to express a desired polypeptide or protein when the cell, through genetic 
manipulation, is made to produce a polypeptide or protein which it normally does not produce or
which the cell normally produces at a lower level. One skilled in the art can readily adapt
procedures for introducing and expressing either recombinant or synthetic sequences into 
eukaryotic or prokaryotic cells in order to generate a cell which produces one of the polypeptides 
or proteins of the present invention. 

The invention also relates to methods for producing a polypeptide comprising growing a
culture of host cells of the invention in a suitable culture medium, and purifying the protein from
the cells or the culture in which the cells are grown. For example, the methods of the invention
include a process for producing a polypeptide in which a host cell containing a suitable
expression vector that includes a polynucleotide of the invention is cultured under conditions that
allow expression of the encoded polypeptide. The polypeptide can be recovered from the
culture, conveniently from the culture medium, or from a lysate prepared from the host cells and
further purified. Preferred embodiments include those in which the protein produced by such
process is a full length or mature form of the protein.

In an alternative method, the polypeptide or protein is purified from bacterial cells which
naturally produce the polypeptide or protein. One skilled in the art can readily follow known
methods for isolating polypeptides and proteins in order to obtain one of the isolated
polypeptides or proteins of the present invention. These include, but are not limited to,
immunochromatography, HPLC, size-exclusion chromatography, ion-exchange chromatography,
and immuno-affinity chromatography. See, e.g., Scopes, Protein Purification: Principles and
Practice, Springer-Verlag (1994); Sambrook, et al., in Molecular Cloning: A Laboratory
Manual; Ausubel et al., Current Protocols in Molecular Biology. Polypeptide fragments that
retain biological/immunological activity include fragments comprising greater than about 100
amino acids, or greater than about 200 amino acids, and fragments that encode specific protein
domains.

The purified polypeptides can be used in in vitro binding assays which are well known in
the art to identify molecules which bind to the polypeptides. These molecules include, but are not
limited to, small molecules, molecules from combinatorial libraries, antibodies or other
proteins. The molecules identified in the binding assay are then tested for antagonist or agonist
activity in in vivo tissue culture or animal models that are well known in the art. In brief, the
molecules are titrated into a plurality of cell cultures or animals and then tested for either
cell/animal death or prolonged survival of the animal/cells.

In addition, the peptides of the invention or molecules capable of binding to the peptides
may be complexed with toxins, e.g., ricin or cholera, or with other compounds that are toxic to
cells. The toxin-binding molecule complex is then targeted to a tumor or other cell by the
specificity of the binding molecule for SEQ ID NO: 1787-3572 and 5359-7144.

The protein of the invention may also be expressed as a product of transgenic animals,
e.g., as a component of the milk of transgenic cows, goats, pigs, or sheep which are characterized
by somatic or germ cells containing a nucleotide sequence encoding the protein.

The proteins provided herein also include proteins characterized by amino acid sequences 
similar to those of purified proteins but into which modifications are naturally provided or
deliberately engineered. For example, modifications, in the peptide or DNA sequence, can be 
made by those skilled in the art using known techniques. Modifications of interest in the protein 
sequences may include the alteration, substitution, replacement, insertion or deletion of a 
selected amino acid residue in the coding sequence. For example, one or more of the cysteine 
residues may be deleted or replaced with another amino acid to alter the conformation of the 
molecule. Techniques for such alteration, substitution, replacement, insertion or deletion are 
well known to those skilled in the art (see, e.g., U.S. Pat. No. 4,518,584). Preferably, such 
alteration, substitution, replacement, insertion or deletion retains the desired activity of the 
protein. Regions of the protein that are important for the protein function can be determined by 
various methods known in the art including the alanine-scanning method which involves
systematic substitution of single or strings of amino acids with alanine, followed by testing the 
resulting alanine-containing variant for biological activity. This type of analysis determines the 
importance of the substituted amino acid(s) in biological activity. Regions of the protein that are 
important for protein function may be determined by the eMATRIX program. 
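
The variant-generation step of the alanine-scanning method described above can be sketched, for illustration only, as follows. The function name, the `window` parameter (used here to model "strings of amino acids"), and the example peptide are assumptions of this sketch; the subsequent testing of each variant for biological activity is an experimental step and is not modeled.

```python
from typing import Iterator, Tuple


def alanine_scan(protein: str, window: int = 1) -> Iterator[Tuple[int, str]]:
    """Yield (1-based start position, variant sequence) pairs in which
    `window` consecutive residues are replaced with alanine.

    Windows that already consist solely of alanine are skipped, since the
    substitution would leave the sequence unchanged.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    protein = protein.upper()
    for i in range(len(protein) - window + 1):
        segment = protein[i:i + window]
        if set(segment) == {"A"}:
            continue
        variant = protein[:i] + "A" * window + protein[i + window:]
        yield i + 1, variant


# Made-up four-residue peptide.
single_variants = list(alanine_scan("MKAL"))
paired_variants = list(alanine_scan("MKAL", window=2))
```

Each variant so generated would then be expressed and assayed, and a loss of activity at a given position would suggest that the substituted residue or residues contribute to function.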

Other fragments and derivatives of the sequences of proteins which would be expected to 
retain protein activity in whole or in part and are useful for screening or other immunological 
methodologies may also be easily made by those skilled in the art given the disclosures herein.
Such modifications are encompassed by the present invention. 

The protein may also be produced by operably linking the isolated polynucleotide of the 
invention to suitable control sequences in one or more insect expression vectors, and employing 
an insect expression system. Materials and methods for baculovirus/insect cell expression 
systems are commercially available in kit form from, e.g., Invitrogen, San Diego, Calif., U.S.A. 
(the MaxBac™ kit), and such methods are well known in the art, as described in Summers and
Smith, Texas Agricultural Experiment Station Bulletin No. 1555 (1987), incorporated herein by 
reference. As used herein, an insect cell capable of expressing a polynucleotide of the present 
invention is "transformed." 

The protein of the invention may be prepared by culturing transformed host cells under 
culture conditions suitable to express the recombinant protein. The resulting expressed protein 
may then be purified from such culture (i.e., from culture medium or cell extracts) using known
purification processes, such as gel filtration and ion exchange chromatography. The purification
of the protein may also include an affinity column containing agents which will bind to the
protein; one or more column steps over such affinity resins as concanavalin A-agarose,
heparin-toyopearl™ or Cibacron blue 3GA Sepharose™; one or more steps involving
hydrophobic interaction chromatography using such resins as phenyl ether, butyl ether, or propyl
ether; or immunoaffinity chromatography.

Alternatively, the protein of the invention may also be expressed in a form which will 
facilitate purification. For example, it may be expressed as a fusion protein, such as those of 
maltose binding protein (MBP), glutathione-S-transferase (GST) or thioredoxin (TRX), or as a 
His tag. Kits for expression and purification of such fusion proteins are commercially available 
from New England BioLabs (Beverly, Mass.), Pharmacia (Piscataway, N.J.) and Invitrogen,
respectively. The protein can also be tagged with an epitope and subsequently purified by using
a specific antibody directed to such epitope. One such epitope ("FLAG®") is commercially
available from Kodak (New Haven, Conn.). 

Finally, one or more reverse-phase high performance liquid chromatography (RP-HPLC)
steps employing hydrophobic RP-HPLC media, e.g., silica gel having pendant methyl or other
aliphatic groups, can be employed to further purify the protein. Some or all of the foregoing
purification steps, in various combinations, can also be employed to provide a substantially
homogeneous isolated recombinant protein. The protein thus purified is substantially free of
other mammalian proteins and is defined in accordance with the present invention as an "isolated
protein."

The polypeptides of the invention include analogs (variants). This embraces fragments,
as well as peptides in which one or more amino acids has been deleted, inserted, or substituted.
Also, analogs of the polypeptides of the invention embrace fusions of the polypeptides or
modifications of the polypeptides of the invention, wherein the polypeptide or analog is fused to
another moiety or moieties, e.g., targeting moiety or another therapeutic agent. Such analogs
may exhibit improved properties such as activity and/or stability. Examples of moieties which
may be fused to the polypeptide or an analog include, for example, targeting moieties which
provide for the delivery of polypeptide to pancreatic cells, e.g., antibodies to pancreatic cells,
antibodies to immune cells such as T-cells, monocytes, dendritic cells, granulocytes, etc., as well
as receptor and ligands expressed on pancreatic or immune cells. Other moieties which may be
fused to the polypeptide include therapeutic agents which are used for treatment, for example,
immunosuppressive drugs such as cyclosporin, FK506, azathioprine, CD3 antibodies and
steroids. Also, polypeptides may be fused to immune modulators, and other cytokines such as
alpha or beta interferon.

4.6.1 DETERMINING POLYPEPTIDE AND POLYNUCLEOTIDE IDENTITY 
AND SIMILARITY 

Preferred methods to determine identity and/or similarity are designed to give the largest match 
between the sequences tested. Methods to determine identity and similarity are codified in 
computer programs including, but not limited to, the GCG program package, including GAP 
(Devereux, J., et al., Nucleic Acids Research 12(1):387 (1984); Genetics Computer Group, 
University of Wisconsin, Madison, WI), BLASTP, BLASTN, BLASTX, FASTA (Altschul, S.F. 
et al., J. Molec. Biol. 215:403-410 (1990)), PSI-BLAST (Altschul S.F. et al., Nucleic Acids Res. 
vol. 25, pp. 3389-3402, herein incorporated by reference), eMatrix software (Wu et al., J. Comp. 
Biol., Vol. 6, pp. 219-235 (1999), herein incorporated by reference), eMotif software 
(Nevill-Manning et al., ISMB-97, Vol. 4, pp. 202-209, herein incorporated by reference), pFam 
software (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1), pp. 320-322 (1998), herein 
incorporated by reference) and the Kyte-Doolittle hydrophobicity prediction algorithm (J. Mol. 
Biol., 157, pp. 105-31 (1982), incorporated herein by reference). The BLAST programs are 
publicly available from the National Center for Biotechnology Information (NCBI) and other 
sources (BLAST Manual, Altschul, S., et al. NCBI NLM NIH Bethesda, MD 20894; Altschul, S., 
et al., J. Mol. Biol. 215:403-410 (1990)). 
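By way of illustration only, the notion of "the largest match between the sequences tested" can 
be sketched with a small global alignment in the style of Needleman and Wunsch. The sketch below 
is not the GAP or BLAST implementation. The function names, the scoring values (match +1, 
mismatch -1, gap -1) and the convention of reporting identity as matching columns divided by 
alignment length are assumptions chosen for the example; the programs named above use 
substitution matrices and their own gap penalties. 

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align two sequences with a simple linear gap penalty.

    Illustrative scoring only; real tools use substitution matrices.
    """
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one best alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if (i > 0 and j > 0 and a[i - 1] == b[j - 1]) else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Percent of alignment columns holding identical residues (assumed convention)."""
    aligned_a, aligned_b = global_align(a, b)
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)
```

Other conventions divide the number of matches by the length of the shorter sequence or by the 
number of aligned (ungapped) positions, so a reported identity should state which denominator 
was used. 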

4.7 CHIMERIC AND FUSION PROTEINS 

The invention also provides chimeric or fusion proteins. As used herein, a "chimeric protein" 
or "fusion protein" comprises a polypeptide of the invention operatively linked to another 
polypeptide. Within a fusion protein the polypeptide according to the invention can 
correspond to all or a portion of a protein according to the invention. In one embodiment, a 
fusion protein comprises at least one biologically active portion of a protein according to the 
invention. In another embodiment, a fusion protein comprises at least two biologically active 
portions of a protein according to the invention. Within the fusion protein, the term 
"operatively linked" is intended to indicate that the polypeptide according to the invention and 
the other polypeptide are fused in-frame to each other. The polypeptide can be fused to the 
N-terminus or C-terminus. 

For example, in one embodiment a fusion protein comprises a polypeptide according to the 
invention operably linked to the extracellular domain of a second protein. 

In another embodiment, the fusion protein is a GST-fusion protein in which the polypeptide 
sequences of the invention are fused to the C-terminus of the GST (i.e., glutathione 
S-transferase) sequences. 

In another embodiment, the fusion protein is an immunoglobulin fusion protein in which 
the polypeptide sequences according to the invention, comprising one or more domains, are fused 
to sequences derived from a member of the immunoglobulin protein family. The 
immunoglobulin fusion proteins of the invention can be incorporated into pharmaceutical 
compositions and administered to a subject to inhibit an interaction between a ligand and a 
protein of the invention on the surface of a cell, to thereby suppress signal transduction in vivo. 
The immunoglobulin fusion proteins can be used to affect the bioavailability of a cognate ligand. 
Inhibition of the ligand/protein interaction may be useful therapeutically for both the treatment of 
proliferative and differentiative disorders, e.g., cancer, as well as modulating (e.g., promoting or 
inhibiting) cell survival. Moreover, the immunoglobulin fusion proteins of the invention can be 
used as immunogens to produce antibodies in a subject, to purify ligands, and in screening assays 
to identify molecules that inhibit the interaction of a polypeptide of the invention with a ligand. 

A chimeric or fusion protein of the invention can be produced by standard recombinant 
DNA techniques. For example, DNA fragments coding for the different polypeptide sequences 
are ligated together in-frame in accordance with conventional techniques, e.g., by employing 
blunt-ended or stagger-ended termini for ligation, restriction enzyme digestion to provide for 
appropriate termini, filling-in of cohesive ends as appropriate, alkaline phosphatase treatment to 
avoid undesirable joining, and enzymatic ligation. In another embodiment, the fusion gene can 
be synthesized by conventional techniques including automated DNA synthesizers. 
Alternatively, PCR amplification of gene fragments can be carried out using anchor primers that 
give rise to complementary overhangs between two consecutive gene fragments that can 
subsequently be annealed and reamplified to generate a chimeric gene sequence (see, for 
example, Ausubel et al. (eds.) Current Protocols in Molecular Biology, John Wiley & 
Sons, 1992). Moreover, many expression vectors are commercially available that already encode 
a fusion moiety (e.g., a GST polypeptide). A nucleic acid encoding a polypeptide of the 
invention can be cloned into such an expression vector such that the fusion moiety is linked 
in-frame to the protein of the invention. 
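The requirement that the two coding sequences be joined "in-frame" can be checked 
computationally before any cloning is attempted. The following is a minimal sketch under stated 
assumptions: the helper names (fuse_in_frame, translate) are hypothetical, the standard genetic 
code is used, the N-terminal partner's stop codon (if present) is removed before joining, and the 
short sequences in the usage note are toy examples rather than sequences of the invention or of 
GST. 

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def translate(cds):
    """Translate a coding sequence whose length is a multiple of three."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def fuse_in_frame(n_terminal_cds, c_terminal_cds):
    """Join two coding sequences so the second is read in the frame of the first.

    Returns (fusion_dna, fusion_protein). Raises ValueError if either part
    would shift the reading frame or if a stop codon interrupts the fusion.
    """
    n_part = n_terminal_cds.upper()
    c_part = c_terminal_cds.upper()
    if len(n_part) % 3 or len(c_part) % 3:
        raise ValueError("each part must be a whole number of codons")
    # Drop the N-terminal partner's stop codon so translation reads through.
    if n_part[-3:] in STOP_CODONS:
        n_part = n_part[:-3]
    fusion = n_part + c_part
    protein = translate(fusion)
    # A stop anywhere except the final codon truncates the fusion protein.
    if "*" in protein[:-1]:
        raise ValueError("premature stop codon in fusion")
    return fusion, protein
```

For example, fuse_in_frame("ATGTCCCCTTAA", "ATGGCTTGGTAA") returns the joined DNA and the 
protein "MSPMAW*", whereas an N-terminal part of eight nucleotides is rejected because it would 
shift the reading frame of the second part. 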

4.8 GENE THERAPY 

Mutations in the polynucleotides of the invention may result in loss of normal 
function of the encoded protein. The invention thus provides gene therapy to restore normal 
activity of the polypeptides of the invention, or to treat disease states involving polypeptides of 
the invention. Delivery of a functional gene encoding polypeptides of the invention to 
appropriate cells is effected ex vivo, in situ, or in vivo by use of vectors, and more particularly 
viral vectors (e.g., adenovirus, adeno-associated virus, or a retrovirus), or ex vivo by use of 
physical DNA transfer methods (e.g., liposomes or chemical treatments). See, for example, 
Anderson, Nature, supplement to vol. 392, no. 6679, pp. 25-30 (1998). For additional reviews of 
gene therapy technology see Friedmann, Science, 244: 1275-1281 (1989); Verma, Scientific 
American: 68-84 (1990); and Miller, Nature, 357: 455-460 (1992). Introduction of any one of 
the nucleotides of the present invention or a gene encoding the polypeptides of the present 
invention can also be accomplished with extrachromosomal substrates (transient expression) or 
artificial chromosomes (stable expression). Cells may also be cultured ex vivo in the presence of 
proteins of the present invention in order to proliferate or to produce a desired effect on or 
activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes. 
Alternatively, it is contemplated that in other human disease states, preventing the expression of 
or inhibiting the activity of polypeptides of the invention will be useful in treating the disease 
states. It is contemplated that antisense therapy or gene therapy could be applied to negatively 
regulate the expression of polypeptides of the invention. 

Other methods inhibiting expression of a protein include the introduction of antisense 
molecules to the nucleic acids of the present invention, their complements, or their translated RNA 
sequences, by methods known in the art. Further, the polypeptides of the present invention can be 
inhibited by using targeted deletion methods, or the insertion of a negative regulatory element such 
as a silencer, which is tissue specific. 
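As an illustration of what an antisense molecule is at the sequence level, an antisense 
oligonucleotide is the reverse complement of the region of the transcript it is meant to bind. 
The sketch below derives a DNA oligonucleotide for a chosen window of an mRNA. The function 
name, the window parameters and the example sequence are hypothetical, and the sketch says 
nothing about which regions of a real transcript are accessible or effective targets. 

```python
# Map each RNA base of the target to the DNA base that pairs with it.
_RNA_TO_COMPLEMENTARY_DNA = str.maketrans("ACGU", "TGCA")


def antisense_dna_oligo(mrna, start, length):
    """Return the DNA oligo (written 5' to 3') complementary to mrna[start:start+length].

    The input may be written with U or T; it is normalized to RNA first.
    """
    rna = mrna.upper().replace("T", "U")
    target = rna[start:start + length]
    if len(target) < length:
        raise ValueError("target window extends past the end of the transcript")
    if set(target) - set("ACGU"):
        raise ValueError("target contains characters other than A, C, G, U")
    # Complement each base, then reverse so the oligo reads 5' to 3'.
    return target.translate(_RNA_TO_COMPLEMENTARY_DNA)[::-1]
```

For the toy transcript "AUGGCCAAG", antisense_dna_oligo("AUGGCCAAG", 0, 6) gives "GGCCAT", 
which pairs with the first six bases (AUGGCC) of the transcript. 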

The present invention still further provides cells genetically engineered in vivo to express the 
polynucleotides of the invention, wherein such polynucleotides are in operative association with a 
regulatory sequence heterologous to the host cell which drives expression of the polynucleotides in 
the cell. These methods can be used to increase or decrease the expression of the polynucleotides of 
the present invention. 

Knowledge of DNA sequences provided by the invention allows for modification of cells to 
permit, increase, or decrease, expression of endogenous polypeptide. Cells can be modified (e.g., by 
homologous recombination) to provide increased polypeptide expression by replacing, in whole or 
in part, the naturally occurring promoter with all or part of a heterologous promoter so that the cells 
express the protein at higher levels. The heterologous promoter is inserted in such a manner that it is 
operatively linked to the desired protein encoding sequences. See, for example, PCT International 
Publication No. WO 94/12650, PCT International Publication No. WO 92/20808, and PCT 
International Publication No. WO 91/09955. It is also contemplated that, in addition to heterologous 
promoter DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which 
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or 
intron DNA may be inserted along with the heterologous promoter DNA. If linked to the desired 
protein coding sequence, amplification of the marker DNA by standard selection methods results in 
co-amplification of the desired protein coding sequences in the cells. 

In another embodiment of the present invention, cells and tissues may be engineered to 
express an endogenous gene comprising the polynucleotides of the invention under the control of 
inducible regulatory elements, in which case the regulatory sequences of the endogenous gene may 
be replaced by homologous recombination. As described herein, gene targeting can be used to 
replace a gene's existing regulatory region with a regulatory sequence isolated from a different gene 
or a novel regulatory sequence synthesized by genetic engineering methods. Such regulatory 
sequences may be comprised of promoters, enhancers, scaffold-attachment regions, negative 
regulatory elements, transcriptional initiation sites, regulatory protein binding sites or combinations 
of said sequences. Alternatively, sequences which affect the structure or stability of the RNA or 
protein produced may be replaced, removed, added, or otherwise modified by targeting. These 
sequences include polyadenylation signals, mRNA stability elements, splice sites, leader sequences 
for enhancing or modifying transport or secretion properties of the protein, or other sequences 
which alter or improve the function or stability of protein or RNA molecules. 

The targeting event may be a simple insertion of the regulatory sequence, placing the gene 
under the control of the new regulatory sequence, e.g., inserting a new promoter or enhancer or both 
upstream of a gene. Alternatively, the targeting event may be a simple deletion of a regulatory 
element, such as the deletion of a tissue-specific negative regulatory element. Alternatively, the 
targeting event may replace an existing element; for example, a tissue-specific enhancer can be 
replaced by an enhancer that has broader or different cell-type specificity than the naturally 
occurring elements. Here, the naturally occurring sequences are deleted and new sequences are 
added. In all cases, the identification of the targeting event may be facilitated by the use of one or 
more selectable marker genes that are contiguous with the targeting DNA, allowing for the selection 
of cells in which the exogenous DNA has integrated into the cell genome. The identification of the 
targeting event may also be facilitated by the use of one or more marker genes exhibiting the 
property of negative selection, such that the negatively selectable marker is linked to the exogenous 
DNA, but configured such that the negatively selectable marker flanks the targeting sequence, and 
such that a correct homologous recombination event with sequences in the host cell genome does 
not result in the stable integration of the negatively selectable marker. Markers useful for this 
purpose include the Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial 
xanthine-guanine phosphoribosyl-transferase (gpt) gene. 
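The logic of combining a selectable marker with a negatively selectable marker can be 
summarized as a small truth table. The sketch below is a deliberately simplified model and not a 
laboratory protocol. It assumes that a random integration event retains both markers, that a 
correct homologous recombination event retains only the selectable marker because the flanking 
negatively selectable marker is not stably integrated, and that cells are exposed to both 
selections together. The class and function names are hypothetical. 

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntegrationOutcome:
    """Which markers a cell stably carries after exposure to the targeting DNA."""
    has_selectable_marker: bool
    has_negative_marker: bool


def survives_double_selection(cell):
    """A cell survives if it carries the selectable marker and lacks the
    negatively selectable marker (simplified model)."""
    return cell.has_selectable_marker and not cell.has_negative_marker


# Assumed marker content for each kind of event (see the stated assumptions).
OUTCOMES = {
    "no integration": IntegrationOutcome(False, False),
    "random integration": IntegrationOutcome(True, True),
    "homologous recombination": IntegrationOutcome(True, False),
}

SURVIVORS = [name for name, cell in OUTCOMES.items()
             if survives_double_selection(cell)]
```

Under these assumptions only the homologous recombination outcome passes both selections, 
which is why the surviving population is enriched for correctly targeted cells. 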

The gene targeting or gene activation techniques which can be used in accordance with this 
aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to Chappel; 
U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No. PCT/US92/09627 
(WO93/09222) by Selden et al.; and International Application No. PCT/US90/06436 
(WO91/06667) by Skoultchi et al., each of which is incorporated by reference herein in its entirety. 

4.9 TRANSGENIC ANIMALS 

In preferred methods to determine biological functions of the polypeptides of the 
invention in vivo, one or more genes provided by the invention are either overexpressed or 
inactivated in the germ line of animals using homologous recombination [Capecchi, Science 
244:1288-1292 (1989)]. Animals in which the gene is overexpressed, under the regulatory 
control of exogenous or endogenous promoter elements, are known as transgenic animals. 
Animals in which an endogenous gene has been inactivated by homologous recombination are 
referred to as "knockout" animals. Knockout animals, preferably non-human mammals, can be 
prepared as described in U.S. Patent No. 5,557,032, incorporated herein by reference. Transgenic 
animals are useful to determine the roles polypeptides of the invention play in biological 
processes, and preferably in disease states. Transgenic animals are useful as model systems to 
identify compounds that modulate lipid metabolism. Transgenic animals, preferably non-human 
mammals, are produced using methods as described in U.S. Patent No. 5,489,743 and PCT 
Publication No. WO94/28122, incorporated herein by reference. 

Transgenic animals can be prepared wherein all or part of a promoter of the 
polynucleotides of the invention is either activated or inactivated to alter the level of expression 
of the polypeptides of the invention. Inactivation can be carried out using homologous 
recombination methods described above. Activation can be achieved by supplementing or even 
replacing the homologous promoter to provide for increased protein expression. The homologous 
promoter can be supplemented by insertion of one or more heterologous enhancer elements 
known to confer promoter activation in a particular tissue. 

The polynucleotides of the present invention also make possible the development, 
through, e.g., homologous recombination or knockout strategies, of animals that fail to express 
polypeptides of the invention or that express a variant polypeptide. Such animals are useful as 
models for studying the in vivo activities of polypeptide as well as for studying modulators of the 
polypeptides of the invention. 

4.10 USES AND BIOLOGICAL ACTIVITY 

The polynucleotides and proteins of the present invention are expected to exhibit one or 
more of the uses or biological activities (including those associated with assays cited herein) 
identified herein. Uses or activities described for proteins of the present invention may be 
provided by administration or use of such proteins or of polynucleotides encoding such proteins 
(such as, for example, in gene therapies or vectors suitable for introduction of DNA). The 
mechanism underlying the particular condition or pathology will dictate whether the 
polypeptides of the invention, the polynucleotides of the invention or modulators (activators or 
inhibitors) thereof would be beneficial to the subject in need of treatment. Thus, "therapeutic 
compositions of the invention" include compositions comprising isolated polynucleotides 
(including recombinant DNA molecules, cloned genes and degenerate variants thereof) or 
polypeptides of the invention (including full length protein, mature protein and truncations or 
domains thereof), or compounds and other substances that modulate the overall activity of the 
target gene products, either at the level of target gene/protein expression or target protein 
activity. Such modulators include polypeptides, analogs (variants), including fragments and 
fusion proteins, antibodies and other binding proteins; chemical compounds that directly or 
indirectly activate or inhibit the polypeptides of the invention (identified, e.g., via drug screening 
assays as described herein); antisense polynucleotides and polynucleotides suitable for triple 
helix formation; and in particular antibodies or other binding partners that specifically recognize 
one or more epitopes of the polypeptides of the invention. 

The polypeptides of the present invention may likewise be involved in cellular activation 
or in one of the other physiological pathways described herein. 



4.10.1 RESEARCH USES AND UTILITIES 

The polynucleotides provided by the present invention can be used by the research 
community for various purposes. The polynucleotides can be used to express recombinant 
protein for analysis, characterization or therapeutic use; as markers for tissues in which the 
corresponding protein is preferentially expressed (either constitutively or at a particular stage of 
tissue differentiation or development or in disease states); as molecular weight markers on gels; 
as chromosome markers or tags (when labeled) to identify chromosomes or to map related gene 
positions; to compare with endogenous DNA sequences in patients to identify potential genetic 
disorders; as probes to hybridize and thus discover novel, related DNA sequences; as a source of 
information to derive PCR primers for genetic fingerprinting; as a probe to "subtract-out" known 
sequences in the process of discovering other novel polynucleotides; for selecting and making 
oligomers for attachment to a "gene chip" or other support, including for examination of 
expression patterns; to raise anti-protein antibodies using DNA immunization techniques; and as 
an antigen to raise anti-DNA antibodies or elicit another immune response. Where the 
polynucleotide encodes a protein which binds or potentially binds to another protein (such as, for 
example, in a receptor-ligand interaction), the polynucleotide can also be used in interaction trap 
assays (such as, for example, that described in Gyuris et al., Cell 75:791-803 (1993)) to identify 
polynucleotides encoding the other protein with which binding occurs or to identify inhibitors of 
the binding interaction. 

The polypeptides provided by the present invention can similarly be used in assays to 
determine biological activity, including in a panel of multiple proteins for high-throughput 
screening; to raise antibodies or to elicit another immune response; as a reagent (including the 
labeled reagent) in assays designed to quantitatively determine levels of the protein (or its 
receptor) in biological fluids; as markers for tissues in which the corresponding polypeptide is 
preferentially expressed (either constitutively or at a particular stage of tissue differentiation or 
development or in a disease state); and, of course, to isolate correlative receptors or ligands. 
Proteins involved in these binding interactions can also be used to screen for peptide or small 
molecule inhibitors or agonists of the binding interaction. 

Any or all of these research utilities are capable of being developed into reagent grade or 
kit format for commercialization as research products. 

Methods for performing the uses listed above are well known to those skilled in the art. 
References disclosing such methods include without limitation "Molecular Cloning: A 
Laboratory Manual", 2d ed., Cold Spring Harbor Laboratory Press, Sambrook, J., E. F. Fritsch 
and T. Maniatis eds., 1989, and "Methods in Enzymology: Guide to Molecular Cloning 
Techniques", Academic Press, Berger, S. L. and A. R. Kimmel eds., 1987. 

4.10.2 NUTRITIONAL USES 

Polynucleotides and polypeptides of the present invention can also be used as nutritional 
sources or supplements. Such uses include without limitation use as a protein or amino acid 
supplement, use as a carbon source, use as a nitrogen source and use as a source of carbohydrate. In 
such cases the polypeptide or polynucleotide of the invention can be added to the feed of a 
particular organism or can be administered as a separate solid or liquid preparation, such as in the 
form of powder, pills, solutions, suspensions or capsules. In the case of microorganisms, the 
polypeptide or polynucleotide of the invention can be added to the medium in or on which the 
microorganism is cultured. 

4.10.3 CYTOKINE AND CELL PROLIFERATION/DIFFERENTIATION 
ACTIVITY 

A polypeptide of the present invention may exhibit activity relating to cytokine, cell 
proliferation (either inducing or inhibiting) or cell differentiation (either inducing or inhibiting) 
activity or may induce production of other cytokines in certain cell populations. A 
polynucleotide of the invention can encode a polypeptide exhibiting such attributes. Many 
protein factors discovered to date, including all known cytokines, have exhibited activity in one 
or more factor-dependent cell proliferation assays, and hence the assays serve as a convenient 
confirmation of cytokine activity. The activity of therapeutic compositions of the present 
invention is evidenced by any one of a number of routine factor dependent cell proliferation 
assays for cell lines including, without limitation, 32D, DA2, DA1G, T10, B9, B9/11, BaF3, 
MC9/G, M+(preB M+), 2E8, RB5, DA1, 123, T1165, HT2, CTLL2, TF-1, Mo7e, CMK, 
HUVEC, and Caco. Therapeutic compositions of the invention can be used in the following: 

Assays for T-cell or thymocyte proliferation include without limitation those described 
in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. 
M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3, 
In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in 
Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Bertagnolli et al., J. Immunol. 
145:1706-1712, 1990; Bertagnolli et al., Cellular Immunology 133:327-341, 1991; Bertagnolli 
et al., J. Immunol. 149:3778-3783, 1992; Bowman et al., J. Immunol. 152:1756-1761, 1994. 

Assays for cytokine production and/or proliferation of spleen cells, lymph node cells or 
thymocytes include, without limitation, those described in: Polyclonal T cell stimulation, 
Kruisbeek, A. M. and Shevach, E. M. In Current Protocols in Immunology. J. E. e.a. Coligan 
eds. Vol 1 pp. 3.12.1-3.12.14, John Wiley and Sons, Toronto. 1994; and Measurement of mouse 
and human interleukin-γ, Schreiber, R. D. In Current Protocols in Immunology. J. E. e.a. Coligan 
eds. Vol 1 pp. 6.8.1-6.8.8, John Wiley and Sons, Toronto. 1994. 

Assays for proliferation and differentiation of hematopoietic and lymphopoietic cells 
include, without limitation, those described in: Measurement of Human and Murine Interleukin 2 
and Interleukin 4, Bottomly, K., Davis, L. S. and Lipsky, P. E. In Current Protocols in 
Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 6.3.1-6.3.12, John Wiley and Sons, Toronto. 1991; 
deVries et al., J. Exp. Med. 173:1205-1211, 1991; Moreau et al., Nature 336:690-692, 1988; 
Greenberger et al., Proc. Natl. Acad. Sci. U.S.A. 80:2931-2938, 1983; Measurement of mouse 
and human interleukin 6-Nordan, R. In Current Protocols in Immunology. J. E. Coligan eds. Vol 
1 pp. 6.6.1-6.6.5, John Wiley and Sons, Toronto. 1991; Smith et al., Proc. Natl. Acad. Sci. 
U.S.A. 83:1857-1861, 1986; Measurement of human Interleukin 11-Bennett, F., Giannotti, J., 
Clark, S. C. and Turner, K. J. In Current Protocols in Immunology. J. E. Coligan eds. Vol 1 pp. 
6.15.1 John Wiley and Sons, Toronto. 1991; Measurement of mouse and human Interleukin 
9-Ciarletta, A., Giannotti, J., Clark, S. C. and Turner, K. J. In Current Protocols in Immunology. 
J. E. Coligan eds. Vol 1 pp. 6.13.1, John Wiley and Sons, Toronto. 1991. 

Assays for T-cell clone responses to antigens (which will identify, among others, proteins 
that affect APC-T cell interactions as well as direct T-cell effects by measuring proliferation and 
cytokine production) include, without limitation, those described in: Current Protocols in 
Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, 
Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3, In Vitro assays for Mouse 
Lymphocyte Function; Chapter 6, Cytokines and their cellular receptors; Chapter 7, 
Immunologic studies in Humans); Weinberger et al., Proc. Natl. Acad. Sci. USA 77:6091-6095, 
1980; Weinberger et al., Eur. J. Immun. 11:405-411, 1981; Takai et al., J. Immunol. 
137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512, 1988. 

4.10.4 STEM CELL GROWTH FACTOR ACTIVITY 

A polypeptide of the present invention may exhibit stem cell growth factor activity and 
be involved in the proliferation, differentiation and survival of pluripotent and totipotent stem 
cells including primordial germ cells, embryonic stem cells, hematopoietic stem cells and/or 
germ line stem cells. Administration of the polypeptide of the invention to stem cells in vivo or 
ex vivo is expected to maintain and expand cell populations in a totipotential or pluripotential 
state which would be useful for re-engineering damaged or diseased tissues, transplantation, 
manufacture of bio-pharmaceuticals and the development of bio-sensors. The ability to produce 
large quantities of human cells has important working applications for the production of human 
proteins which currently must be obtained from non-human sources or donors; implantation of 
cells to treat diseases such as Parkinson's, Alzheimer's and other neurodegenerative diseases; 
tissues for grafting such as bone marrow, skin, cartilage, tendons, bone, muscle (including 
cardiac muscle), blood vessels, cornea, neural cells, gastrointestinal cells and others; and organs 
for transplantation such as kidney, liver, pancreas (including islet cells), heart and lung. 

It is contemplated that multiple different exogenous growth factors and/or cytokines may 
be administered in combination with the polypeptide of the invention to achieve the desired 
effect, including any of the growth factors listed herein, other stem cell maintenance factors, and 
specifically including stem cell factor (SCF), leukemia inhibitory factor (LIF), Flt-3 ligand 
(Flt-3L), any of the interleukins, recombinant soluble IL-6 receptor fused to IL-6, macrophage 
inflammatory protein 1-alpha (MIP-1-alpha), G-CSF, GM-CSF, thrombopoietin (TPO), platelet 
factor 4 (PF-4), platelet-derived growth factor (PDGF), neural growth factors and basic fibroblast 
growth factor (bFGF). 

Since totipotent stem cells can give rise to virtually any mature cell type, expansion of 
these cells in culture will facilitate the production of large quantities of mature cells. Techniques 
for culturing stem cells are known in the art and administration of polypeptides of the invention, 
optionally with other growth factors and/or cytokines, is expected to enhance the survival and 
proliferation of the stem cell populations. This can be accomplished by direct administration of 
the polypeptide of the invention to the culture medium. Alternatively, stromal cells transfected 
with a polynucleotide that encodes the polypeptide of the invention can be used as a feeder 
layer for the stem cell populations in culture or in vivo. Stromal support cells for feeder layers 
may include embryonic bone marrow fibroblasts, bone marrow stromal cells, fetal liver cells, or 
cultured embryonic fibroblasts (see U.S. Patent No. 5,690,926). 

Stem cells themselves can be transfected with a polynucleotide of the invention to induce 
autocrine expression of the polypeptide of the invention. This will allow for generation of 
undifferentiated totipotential/pluripotential stem cell lines that are useful as is or that can then be 
differentiated into the desired mature cell types. These stable cell lines can also serve as a source 
of undifferentiated totipotential/pluripotential mRNA to create cDNA libraries and templates for 
polymerase chain reaction experiments. These studies would allow for the isolation and 
identification of differentially expressed genes in stem cell populations that regulate stem cell 
proliferation and/or maintenance. 

Expansion and maintenance of totipotent stem cell populations will be useful in the 
treatment of many pathological conditions. For example, polypeptides of the present invention 
may be used to manipulate stem cells in culture to give rise to neuroepithelial cells that can be 
used to augment or replace cells damaged by illness, autoimmune disease, accidental damage or 
genetic disorders. The polypeptide of the invention may be useful for inducing the proliferation 
of neural cells and for the regeneration of nerve and brain tissue, i.e. for the treatment of central 
and peripheral nervous system diseases and neuropathies, as well as mechanical and traumatic 
disorders which involve degeneration, death or trauma to neural cells or nerve tissue. In addition, 
the expanded stem cell populations can also be genetically altered for gene therapy purposes and 
to decrease host rejection of replacement tissues after grafting or implantation. 

Expression of the polypeptide of the invention and its effect on stem cells can also be
manipulated to achieve controlled differentiation of the stem cells into more differentiated cell
types. A broadly applicable method of obtaining pure populations of a specific differentiated
cell type from undifferentiated stem cell populations involves the use of a cell-type specific
promoter driving a selectable marker. The selectable marker allows only cells of the desired type
to survive. For example, stem cells can be induced to differentiate into cardiomyocytes (Wobus
et al., Differentiation, 48: 173-182, (1991); Klug et al., J. Clin. Invest., 98(1): 216-224, (1998))
or skeletal muscle cells (Browder, L. W. In: Principles of Tissue Engineering eds. Lanza et al.,
Academic Press (1997)). Alternatively, directed differentiation of stem cells can be
accomplished by culturing the stem cells in the presence of a differentiation factor such as
retinoic acid and an antagonist of the polypeptide of the invention which would inhibit the
effects of endogenous stem cell factor activity and allow differentiation to proceed.

In vitro cultures of stem cells can be used to determine if the polypeptide of the invention
exhibits stem cell growth factor activity. Stem cells are isolated from any one of various cell
sources (including hematopoietic stem cells and embryonic stem cells) and cultured on a feeder
layer, as described by Thompson et al. Proc. Natl. Acad. Sci. U.S.A., 92: 7844-7848 (1995), in
the presence of the polypeptide of the invention alone or in combination with other growth
factors or cytokines. The ability of the polypeptide of the invention to induce stem cell
proliferation is determined by colony formation on semi-solid support, e.g. as described by
Bernstein et al. Blood, 77: 2316-2321 (1991).

4.10.5 HEMATOPOIESIS REGULATING ACTIVITY 

A polypeptide of the present invention may be involved in regulation of hematopoiesis
and, consequently, in the treatment of myeloid or lymphoid cell disorders. Even marginal
biological activity in support of colony forming cells or of factor-dependent cell lines indicates
involvement in regulating hematopoiesis, e.g. in supporting the growth and proliferation of
erythroid progenitor cells alone or in combination with other cytokines, thereby indicating utility,
for example, in treating various anemias or for use in conjunction with irradiation/chemotherapy
to stimulate the production of erythroid precursors and/or erythroid cells; in supporting the
growth and proliferation of myeloid cells such as granulocytes and monocytes/macrophages (i.e.,
traditional CSF activity) useful, for example, in conjunction with chemotherapy to prevent or
treat consequent myelo-suppression; in supporting the growth and proliferation of
megakaryocytes and consequently of platelets thereby allowing prevention or treatment of
various platelet disorders such as thrombocytopenia, and generally for use in place of or
complementary to platelet transfusions; and/or in supporting the growth and proliferation of
hematopoietic stem cells which are capable of maturing to any and all of the above-mentioned
hematopoietic cells and therefore find therapeutic utility in various stem cell disorders (such as
those usually treated with transplantation, including, without limitation, aplastic anemia and
paroxysmal nocturnal hemoglobinuria), as well as in repopulating the stem cell compartment
post irradiation/chemotherapy, either in vivo or ex vivo (i.e., in conjunction with bone marrow
transplantation or with peripheral progenitor cell transplantation (homologous or heterologous))
as normal cells or genetically manipulated for gene therapy.

Therapeutic compositions of the invention can be used in the following: 

Suitable assays for proliferation and differentiation of various hematopoietic lines are
cited above.

Assays for embryonic stem cell differentiation (which will identify, among others,
proteins that influence embryonic differentiation hematopoiesis) include, without limitation,
those described in: Johansson et al. Cellular Biology 15:141-151, 1995; Keller et al., Molecular
and Cellular Biology 13:473-486, 1993; McClanahan et al., Blood 81:2903-2915, 1993.

Assays for stem cell survival and differentiation (which will identify, among others,
proteins that regulate lympho-hematopoiesis) include, without limitation, those described in:
Methylcellulose colony forming assays, Freshney, M. G. In Culture of Hematopoietic Cells. R. I.
Freshney, et al. eds. Vol pp. 265-268, Wiley-Liss, Inc., New York, N.Y. 1994; Hirayama et al.,
Proc. Natl. Acad. Sci. USA 89:5907-5911, 1992; Primitive hematopoietic colony forming cells
with high proliferative potential, McNiece, I. K. and Briddell, R. A. In Culture of Hematopoietic
Cells. R. I. Freshney, et al. eds. Vol pp. 23-39, Wiley-Liss, Inc., New York, N.Y. 1994; Neben et
al., Experimental Hematology 22:353-359, 1994; Cobblestone area forming cell assay,
Ploemacher, R. E. In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 1-21,
Wiley-Liss, Inc., New York, N.Y. 1994; Long term bone marrow cultures in the presence of
stromal cells, Spooncer, E., Dexter, M. and Allen, T. In Culture of Hematopoietic Cells. R. I.
Freshney, et al. eds. Vol pp. 163-179, Wiley-Liss, Inc., New York, N.Y. 1994; Long term culture
initiating cell assay, Sutherland, H. J. In Culture of Hematopoietic Cells. R. I. Freshney, et al.
eds. Vol pp. 139-162, Wiley-Liss, Inc., New York, N.Y. 1994.

4.10.6 TISSUE GROWTH ACTIVITY

A polypeptide of the present invention also may be involved in bone, cartilage, tendon, 
ligament and/or nerve tissue growth or regeneration, as well as in wound healing and tissue 
repair and replacement, and in healing of burns, incisions and ulcers.

A polypeptide of the present invention which induces cartilage and/or bone growth in
circumstances where bone is not normally formed has application in the healing of bone
fractures and cartilage damage or defects in humans and other animals. Compositions of a
polypeptide, antibody, binding partner, or other modulator of the invention may have
prophylactic use in closed as well as open fracture reduction and also in the improved fixation of
artificial joints. De novo bone formation induced by an osteogenic agent contributes to the repair
of congenital, trauma induced, or oncologic resection induced craniofacial defects, and also is
useful in cosmetic plastic surgery.

A polypeptide of this invention may also be involved in attracting bone-forming cells,
stimulating growth of bone-forming cells, or inducing differentiation of progenitors of
bone-forming cells. Treatment of osteoporosis, osteoarthritis, bone degenerative disorders, or
periodontal disease, such as through stimulation of bone and/or cartilage repair or by blocking
inflammation or processes of tissue destruction (collagenase activity, osteoclast activity, etc.)
mediated by inflammatory processes may also be possible using the composition of the
invention.

Another category of tissue regeneration activity that may involve the polypeptide of the
present invention is tendon/ligament formation. Induction of tendon/ligament-like tissue or
other tissue formation in circumstances where such tissue is not normally formed has application
in the healing of tendon or ligament tears, deformities and other tendon or ligament defects in
humans and other animals. Such a preparation employing a tendon/ligament-like tissue inducing
protein may have prophylactic use in preventing damage to tendon or ligament tissue, as well as
use in the improved fixation of tendon or ligament to bone or other tissues, and in repairing
defects to tendon or ligament tissue. De novo tendon/ligament-like tissue formation induced by
a composition of the present invention contributes to the repair of congenital, trauma induced, or
other tendon or ligament defects of other origin, and is also useful in cosmetic plastic surgery for
attachment or repair of tendons or ligaments. The compositions of the present invention may
provide an environment to attract tendon- or ligament-forming cells, stimulate growth of tendon- or
ligament-forming cells, induce differentiation of progenitors of tendon- or ligament-forming
cells, or induce growth of tendon/ligament cells or progenitors ex vivo for return in vivo to effect
tissue repair. The compositions of the invention may also be useful in the treatment of tendinitis,
carpal tunnel syndrome and other tendon or ligament defects. The compositions may also include
an appropriate matrix and/or sequestering agent as a carrier as is well known in the art.

The compositions of the present invention may also be useful for proliferation of neural
cells and for regeneration of nerve and brain tissue, i.e., for the treatment of central and peripheral
nervous system diseases and neuropathies, as well as mechanical and traumatic disorders, which
involve degeneration, death or trauma to neural cells or nerve tissue. More specifically, a
composition may be used in the treatment of diseases of the peripheral nervous system, such as
peripheral nerve injuries, peripheral neuropathy and localized neuropathies, and central nervous
system diseases, such as Alzheimer's disease, Parkinson's disease, Huntington's disease, amyotrophic
lateral sclerosis, and Shy-Drager syndrome. Further conditions which may be treated in
accordance with the present invention include mechanical and traumatic disorders, such as spinal
cord disorders, head trauma and cerebrovascular diseases such as stroke. Peripheral neuropathies
resulting from chemotherapy or other medical therapies may also be treatable using a
composition of the invention.

Compositions of the invention may also be useful to promote better or faster closure of
non-healing wounds, including without limitation pressure ulcers, ulcers associated with vascular
insufficiency, surgical and traumatic wounds, and the like.

Compositions of the present invention may also be involved in the generation or
regeneration of other tissues, such as organs (including, for example, pancreas, liver, intestine,
kidney, skin, endothelium), muscle (smooth, skeletal or cardiac) and vascular (including vascular
endothelium) tissue, or for promoting the growth of cells comprising such tissues. Part of the
desired effects may be by inhibition or modulation of fibrotic scarring to allow normal tissue
to regenerate. A polypeptide of the present invention may also exhibit angiogenic activity.

A composition of the present invention may also be useful for gut protection or 
regeneration and treatment of lung or liver fibrosis, reperfusion injury in various tissues, and 
conditions resulting from systemic cytokine damage. 

A composition of the present invention may also be useful for promoting or inhibiting 
differentiation of tissues described above from precursor tissues or cells; or for inhibiting the
growth of tissues described above. 

Therapeutic compositions of the invention can be used in the following: 
Assays for tissue generation activity include, without limitation, those described in: 
International Patent Publication No. WO95/16035 (bone, cartilage, tendon); International Patent 
Publication No. WO95/05846 (nerve, neuronal); International Patent Publication No. 
WO91/07491 (skin, endothelium). 

Assays for wound healing activity include, without limitation, those described in: Winter,
Epidermal Wound Healing, pp. 71-112 (Maibach, H. I. and Rovee, D. T., eds.), Year Book
Medical Publishers, Inc., Chicago, as modified by Eaglstein and Mertz, J. Invest. Dermatol.
71:382-84 (1978).

4.10.7 IMMUNE STIMULATING OR SUPPRESSING ACTIVITY

A polypeptide of the present invention may also exhibit immune stimulating or immune
suppressing activity, including without limitation the activities for which assays are described
herein. A polynucleotide of the invention can encode a polypeptide exhibiting such activities. A
protein may be useful in the treatment of various immune deficiencies and disorders (including
severe combined immunodeficiency (SCID)), e.g., in regulating (up or down) growth and
proliferation of T and/or B lymphocytes, as well as affecting the cytolytic activity of NK cells
and other cell populations. These immune deficiencies may be genetic or be caused by viral (e.g.,
HIV) as well as bacterial or fungal infections, or may result from autoimmune disorders. More
specifically, infectious diseases caused by viral, bacterial, fungal or other infection may be
treatable using a protein of the present invention, including infections by HIV, hepatitis viruses,
herpes viruses, mycobacteria, Leishmania spp., malaria spp. and various fungal infections such
as candidiasis. Of course, in this regard, proteins of the present invention may also be useful
where a boost to the immune system generally may be desirable, i.e., in the treatment of cancer.

Autoimmune disorders which may be treated using a protein of the present invention
include, for example, connective tissue disease, multiple sclerosis, systemic lupus erythematosus,
WO 01/53312 PCT/US00/34263 
rheumatoid arthritis, autoimmune pulmonary inflammation, Guillain-Barre syndrome,
autoimmune thyroiditis, insulin dependent diabetes mellitus, myasthenia gravis, graft-versus-host
disease and autoimmune inflammatory eye disease. Such a protein (or antagonists thereof,
including antibodies) of the present invention may also be useful in the treatment of allergic
reactions and conditions (e.g., anaphylaxis, serum sickness, drug reactions, food allergies, insect
venom allergies, mastocytosis, allergic rhinitis, hypersensitivity pneumonitis, urticaria,
angioedema, eczema, atopic dermatitis, allergic contact dermatitis, erythema multiforme,
Stevens-Johnson syndrome, allergic conjunctivitis, atopic keratoconjunctivitis, vernal
keratoconjunctivitis, giant papillary conjunctivitis and contact allergies), such as asthma
(particularly allergic asthma) or other respiratory problems. Other conditions, in which immune
suppression is desired (including, for example, organ transplantation), may also be treatable
using a protein (or antagonists thereof) of the present invention. The therapeutic effects of the
polypeptides or antagonists thereof on allergic reactions can be evaluated by in vivo animal
models such as the cumulative contact enhancement test (Lastbom et al., Toxicology 125: 59-66,
1998), skin prick test (Hoffmann et al., Allergy 54: 446-54, 1999), guinea pig skin sensitization
test (Vohr et al., Arch. Toxicol. 73: 501-9), and murine local lymph node assay (Kimber et al.,
J. Toxicol. Environ. Health 53: 563-79).

Using the proteins of the invention it may also be possible to modulate immune
responses in a number of ways. Down regulation may be in the form of inhibiting or blocking an
immune response already in progress or may involve preventing the induction of an immune 
response. The functions of activated T cells may be inhibited by suppressing T cell responses or 
by inducing specific tolerance in T cells, or both. Immunosuppression of T cell responses is 
generally an active, non-antigen-specific, process which requires continuous exposure of the T 
cells to the suppressive agent. Tolerance, which involves inducing non-responsiveness or anergy 
in T cells, is distinguishable from immunosuppression in that it is generally antigen-specific and 
persists after exposure to the tolerizing agent has ceased. Operationally, tolerance can be 
demonstrated by the lack of a T cell response upon reexposure to specific antigen in the absence 
of the tolerizing agent. 

Down regulating or preventing one or more antigen functions (including without 
limitation B lymphocyte antigen functions (such as, for example, B7)), e.g., preventing high 
level lymphokine synthesis by activated T cells, will be useful in situations of tissue, skin and 
organ transplantation and in graft-versus-host disease (GVHD). For example, blockage of T cell 
function should result in reduced tissue destruction in tissue transplantation. Typically, in tissue 
transplants, rejection of the transplant is initiated through its recognition as foreign by T cells, 
followed by an immune reaction that destroys the transplant. The administration of a therapeutic
composition of the invention may prevent cytokine synthesis by immune cells, such as T cells,
and thus acts as an immunosuppressant. Moreover, a lack of costimulation may also be sufficient 
to anergize the T cells, thereby inducing tolerance in a subject. Induction of long-term tolerance 
by B lymphocyte antigen-blocking reagents may avoid the necessity of repeated administration 
of these blocking reagents. To achieve sufficient immunosuppression or tolerance in a subject, it 
may also be necessary to block the function of a combination of B lymphocyte antigens. 

The efficacy of particular therapeutic compositions in preventing organ transplant 
rejection or GVHD can be assessed using animal models that are predictive of efficacy in 
humans. Examples of appropriate systems which can be used include allogeneic cardiac grafts in 
rats and xenogeneic pancreatic islet cell grafts in mice, both of which have been used to examine 
the immunosuppressive effects of CTLA4Ig fusion proteins in vivo as described in Lenschow et
al., Science 257:789-792 (1992) and Turka et al., Proc. Natl. Acad. Sci. USA, 89:11102-11105
(1992). In addition, murine models of GVHD (see Paul ed., Fundamental Immunology, Raven
Press, New York, 1989, pp. 846-847) can be used to determine the effect of therapeutic 
compositions of the invention on the development of that disease. 

Blocking antigen function may also be therapeutically useful for treating autoimmune 
diseases. Many autoimmune disorders are the result of inappropriate activation of T cells that are 
reactive against self tissue and which promote the production of cytokines and autoantibodies 
involved in the pathology of the diseases. Preventing the activation of autoreactive T cells may 
reduce or eliminate disease symptoms. Administration of reagents which block stimulation of T 
cells can be used to inhibit T cell activation and prevent production of autoantibodies or T 
cell-derived cytokines which may be involved in the disease process. Additionally, blocking 
reagents may induce antigen-specific tolerance of autoreactive T cells which could lead to 
long-term relief from the disease. The efficacy of blocking reagents in preventing or alleviating 
autoimmune disorders can be determined using a number of well-characterized animal models of 
human autoimmune diseases. Examples include murine experimental autoimmune encephalitis,
systemic lupus erythematosus in MRL/lpr/lpr mice or NZB hybrid mice, murine autoimmune
collagen arthritis, diabetes mellitus in NOD mice and BB rats, and murine experimental 
myasthenia gravis (see Paul ed., Fundamental Immunology, Raven Press, New York, 1989, pp. 
840-856). 

Upregulation of an antigen function (e.g., a B lymphocyte antigen function), as a means
of upregulating immune responses, may also be useful in therapy. Upregulation of immune
responses may be in the form of enhancing an existing immune response or eliciting an initial
immune response. For example, enhancing an immune response may be useful in cases of viral
infection, including systemic viral diseases such as influenza, the common cold, and encephalitis.

Alternatively, anti-viral immune responses may be enhanced in an infected patient by
removing T cells from the patient, costimulating the T cells in vitro with viral antigen-pulsed
APCs either expressing a peptide of the present invention or together with a stimulatory form of
a soluble peptide of the present invention and reintroducing the in vitro activated T cells into the
patient. Another method of enhancing anti-viral immune responses would be to isolate infected
cells from a patient, transfect them with a nucleic acid encoding a protein of the present
invention as described herein such that the cells express all or a portion of the protein on their
surface, and reintroduce the transfected cells into the patient. The infected cells would now be
capable of delivering a costimulatory signal to, and thereby activate, T cells in vivo.

A polypeptide of the present invention may provide the necessary stimulation signal to T
cells to induce a T cell mediated immune response against the transfected tumor cells. In
addition, tumor cells which lack MHC class I or MHC class II molecules, or which fail to
reexpress sufficient amounts of MHC class I or MHC class II molecules, can be transfected with
nucleic acid encoding all or a portion (e.g., a cytoplasmic-domain truncated portion) of an
MHC class I alpha chain protein and β2 microglobulin protein or an MHC class II alpha chain
protein and an MHC class II beta chain protein to thereby express MHC class I or MHC class II
proteins on the cell surface. Expression of the appropriate class I or class II MHC in conjunction
with a peptide having the activity of a B lymphocyte antigen (e.g., B7-1, B7-2, B7-3) induces a T
cell mediated immune response against the transfected tumor cell. Optionally, a gene encoding
an antisense construct which blocks expression of an MHC class II associated protein, such as
the invariant chain, can also be cotransfected with a DNA encoding a peptide having the activity
of a B lymphocyte antigen to promote presentation of tumor associated antigens and induce
tumor specific immunity. Thus, the induction of a T cell mediated immune response in a human
subject may be sufficient to overcome tumor-specific tolerance in the subject.

The activity of a protein of the invention may, among other means, be measured by the
following methods:

Suitable assays for thymocyte or splenocyte cytotoxicity include, without limitation,
those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D.
H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and
Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function 3.1-3.19;
Chapter 7, Immunologic studies in Humans); Herrmann et al., Proc. Natl. Acad. Sci. USA
78:2488-2492, 1981; Herrmann et al., J. Immunol. 128:1968-1974, 1982; Handa et al., J.
Immunol. 135:1564-1572, 1985; Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J.
Immunol. 140:508-512, 1988; Bowman et al., J. Virology 61:1992-1998; Bertagnolli et al.,
Cellular Immunology 133:327-341, 1991; Brown et al., J. Immunol. 153:3079-3092, 1994.

Assays for T-cell-dependent immunoglobulin responses and isotype switching (which
will identify, among others, proteins that modulate T-cell dependent antibody responses and that
affect Th1/Th2 profiles) include, without limitation, those described in: Maliszewski, J.
Immunol. 144:3028-3033, 1990; and Assays for B cell function: In vitro antibody production,
Mond, J. J. and Brunswick, M. In Current Protocols in Immunology. J. E. Coligan et al. eds. Vol 1
pp. 3.8.1-3.8.16, John Wiley and Sons, Toronto. 1994.

Mixed lymphocyte reaction (MLR) assays (which will identify, among others, proteins
that generate predominantly Th1 and CTL responses) include, without limitation, those described
in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E.
M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3,
In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in
Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512,
1988; Bertagnolli et al., J. Immunol. 149:3778-3783, 1992.

Dendritic cell-dependent assays (which will identify, among others, proteins expressed by
dendritic cells that activate naive T-cells) include, without limitation, those described in: Guery
et al., J. Immunol. 134:536-544, 1995; Inaba et al., Journal of Experimental Medicine
173:549-559, 1991; Macatonia et al., Journal of Immunology 154:5071-5079, 1995; Porgador et
al., Journal of Experimental Medicine 182:255-260, 1995; Nair et al., Journal of Virology
67:4062-4069, 1993; Huang et al., Science 264:961-965, 1994; Macatonia et al., Journal of
Experimental Medicine 169:1255-1264, 1989; Bhardwaj et al., Journal of Clinical Investigation
94:797-807, 1994; and Inaba et al., Journal of Experimental Medicine 172:631-640, 1990.

Assays for lymphocyte survival/apoptosis (which will identify, among others, proteins
that prevent apoptosis after superantigen induction and proteins that regulate lymphocyte
homeostasis) include, without limitation, those described in: Darzynkiewicz et al., Cytometry
13:795-808, 1992; Gorczyca et al., Leukemia 7:659-670, 1993; Gorczyca et al., Cancer Research
53:1945-1951, 1993; Itoh et al., Cell 66:233-243, 1991; Zacharchuk, Journal of Immunology
145:4037-4045, 1990; Zamai et al., Cytometry 14:891-897, 1993; Gorczyca et al., International
Journal of Oncology 1:639-648, 1992.

Assays for proteins that influence early steps of T-cell commitment and development
include, without limitation, those described in: Antica et al., Blood 84:111-117, 1994; Fine et al.,
Cellular Immunology 155:111-122, 1994; Galy et al., Blood 85:2770-2778, 1995; Toki et al.,
Proc. Natl. Acad. Sci. USA 88:7548-7551, 1991.

4.10.8 ACTIVIN/INHIBIN ACTIVITY

A polypeptide of the present invention may also exhibit activin- or inhibin-related
activities. A polynucleotide of the invention may encode a polypeptide exhibiting such
characteristics. Inhibins are characterized by their ability to inhibit the release of follicle
stimulating hormone (FSH), while activins are characterized by their ability to stimulate the
release of follicle stimulating hormone (FSH). Thus, a polypeptide of the present invention,
alone or in heterodimers with a member of the inhibin family, may be useful as a contraceptive
based on the ability of inhibins to decrease fertility in female mammals and decrease
spermatogenesis in male mammals. Administration of sufficient amounts of other inhibins can
induce infertility in these mammals. Alternatively, the polypeptide of the invention, as a
homodimer or as a heterodimer with other protein subunits of the inhibin group, may be useful as
a fertility inducing therapeutic, based upon the ability of activin molecules to stimulate FSH
release from cells of the anterior pituitary. See, for example, U.S. Pat. No. 4,798,885. A
polypeptide of the invention may also be useful for advancement of the onset of fertility in
sexually immature mammals, so as to increase the lifetime reproductive performance of domestic
animals such as, but not limited to, cows, sheep and pigs.

The activity of a polypeptide of the invention may, among other means, be measured by 
the following methods. 

Assays for activin/inhibin activity include, without limitation, those described in: Vale et
al., Endocrinology 91:562-572, 1972; Ling et al., Nature 321:779-782, 1986; Vale et al., Nature
321:776-779, 1986; Mason et al., Nature 318:659-663, 1985; Forage et al., Proc. Natl. Acad. Sci.
USA 83:3091-3095, 1986.

4.10.9 CHEMOTACTIC/CHEMOKINETIC ACTIVITY 

A polypeptide of the present invention may be involved in chemotactic or chemokinetic
activity for mammalian cells, including, for example, monocytes, fibroblasts, neutrophils,
T-cells, mast cells, eosinophils, epithelial and/or endothelial cells. A polynucleotide of the
invention can encode a polypeptide exhibiting such attributes. Chemotactic and chemokinetic
receptor activation can be used to mobilize or attract a desired cell population to a desired site of
action. Chemotactic or chemokinetic compositions (e.g. proteins, antibodies, binding partners, or
modulators of the invention) provide particular advantages in treatment of wounds and other
trauma to tissues, as well as in treatment of localized infections. For example, attraction of
lymphocytes, monocytes or neutrophils to tumors or sites of infection may result in improved
immune responses against the tumor or infecting agent.

A protein or peptide has chemotactic activity for a particular cell population if it can
stimulate, directly or indirectly, the directed orientation or movement of such cell population.
Preferably, the protein or peptide has the ability to directly stimulate directed movement of cells.
Whether a particular protein has chemotactic activity for a population of cells can be readily
determined by employing such protein or peptide in any known assay for cell chemotaxis.

Therapeutic compositions of the invention can be used in the following:

Assays for chemotactic activity (which will identify proteins that induce or prevent
chemotaxis) consist of assays that measure the ability of a protein to induce the migration of cells
across a membrane as well as the ability of a protein to induce the adhesion of one cell
population to another cell population. Suitable assays for movement and adhesion include,
without limitation, those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A.
M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates
and Wiley-Interscience (Chapter 6.12, Measurement of alpha and beta Chemokines
6.12.1-6.12.28); Taub et al. J. Clin. Invest. 95:1370-1376, 1995; Lind et al. APMIS 103:140-146,
1995; Muller et al. Eur. J. Immunol. 25:1744-1748; Gruber et al. J. of Immunol. 152:5860-5867,
1994; Johnston et al. J. of Immunol. 153:1762-1768, 1994.

4.10.10 HEMOSTATIC AND THROMBOLYTIC ACTIVITY 

A polypeptide of the invention may also be involved in hemostasis or thrombolysis or
thrombosis. A polynucleotide of the invention can encode a polypeptide exhibiting such
attributes. Compositions may be useful in treatment of various coagulation disorders (including
hereditary disorders, such as hemophilias) or to enhance coagulation and other hemostatic events
in treating wounds resulting from trauma, surgery or other causes. A composition of the
invention may also be useful for dissolving or inhibiting formation of thromboses and for
treatment and prevention of conditions resulting therefrom (such as, for example, infarction of
cardiac and central nervous system vessels (e.g., stroke)).

Therapeutic compositions of the invention can be used in the following:

Assays for hemostatic and thrombolytic activity include, without limitation, those
described in: Linet et al., J. Clin. Pharmacol. 26:131-140, 1986; Burdick et al., Thrombosis Res. 
45:413-419, 1987; Humphrey et al., Fibrinolysis 5:71-79 (1991); Schaub, Prostaglandins 
35:467-474, 1988. 

4.10.11 CANCER DIAGNOSIS AND THERAPY 

Polypeptides of the invention may be involved in cancer cell generation, proliferation or 
metastasis. Detection of the presence or amount of polynucleotides or polypeptides of the 
invention may be useful for the diagnosis and/or prognosis of one or more types of cancer. For 
example, the presence or increased expression of a polynucleotide/polypeptide of the invention 
may indicate a hereditary risk of cancer, a precancerous condition, or an ongoing malignancy. 
Conversely, a defect in the gene or absence of the polypeptide may be associated with a cancer 
condition. Identification of single nucleotide polymorphisms associated with cancer or a 
predisposition to cancer may also be useful for diagnosis or prognosis. 

Cancer treatments promote tumor regression by inhibiting tumor cell proliferation, 
inhibiting angiogenesis (growth of new blood vessels that is necessary to support tumor growth) 
and/or prohibiting metastasis by reducing tumor cell motility or invasiveness. Therapeutic 
compositions of the invention may be effective in adult and pediatric oncology including in solid 
phase tumors/malignancies, locally advanced tumors, human soft tissue sarcomas, metastatic 
cancer, including lymphatic metastases, blood cell malignancies including multiple myeloma, 
acute and chronic leukemias, and lymphomas, head and neck cancers including mouth cancer, 
larynx cancer and thyroid cancer, lung cancers including small cell carcinoma and non-small cell 
cancers, breast cancers including small cell carcinoma and ductal carcinoma, gastrointestinal 
cancers including esophageal cancer, stomach cancer, colon cancer, colorectal cancer and polyps 
associated with colorectal neoplasia, pancreatic cancers, liver cancer, urologic cancers including 
bladder cancer and prostate cancer, malignancies of the female genital tract including ovarian 
carcinoma, uterine (including endometrial) cancers, and solid tumor in the ovarian follicle, 
kidney cancers including renal cell carcinoma, brain cancers including intrinsic brain tumors, 
neuroblastoma, astrocytic brain tumors, gliomas, metastatic tumor cell invasion in the central 
nervous system, bone cancers including osteomas, skin cancers including malignant melanoma, 
tumor progression of human skin keratinocytes, squamous cell carcinoma, basal cell carcinoma, 
hemangiopericytoma and Kaposi's sarcoma. 

Polypeptides, polynucleotides, or modulators of polypeptides of the invention (including 
inhibitors and stimulators of the biological activity of the polypeptide of the invention) may be 
administered to treat cancer. Therapeutic compositions can be administered in therapeutically 
effective dosages alone or in combination with adjuvant cancer therapy such as surgery, 
chemotherapy, radiotherapy, thermotherapy, and laser therapy, and may provide a beneficial 
effect, e.g. reducing tumor size, slowing rate of tumor growth, inhibiting metastasis, or otherwise 
improving overall clinical condition, without necessarily eradicating the cancer. 

The composition can also be administered in therapeutically effective amounts as a 
portion of an anti-cancer cocktail. An anti-cancer cocktail is a mixture of the polypeptide or 
modulator of the invention with one or more anti-cancer drugs in addition to a pharmaceutically 
acceptable carrier for delivery. The use of anti-cancer cocktails as a cancer treatment is routine. 
Anti-cancer drugs that are well known in the art and can be used as a treatment in combination 
with the polypeptide or modulator of the invention include: Actinomycin D, Aminoglutethimide, 
Asparaginase, Bleomycin, Busulfan, Carboplatin, Carmustine, Chlorambucil, Cisplatin (cis-DDP), 
Cyclophosphamide, Cytarabine HCl (Cytosine arabinoside), Dacarbazine, Dactinomycin, 
Daunorubicin HCl, Doxorubicin HCl, Estramustine phosphate sodium, Etoposide (VP16-213), 
Floxuridine, 5-Fluorouracil (5-FU), Flutamide, Hydroxyurea (hydroxycarbamide), Ifosfamide, 
Interferon Alpha-2a, Interferon Alpha-2b, Leuprolide acetate (LHRH-releasing factor analog), 
Lomustine, Mechlorethamine HCl (nitrogen mustard), Melphalan, Mercaptopurine, Mesna, 
Methotrexate (MTX), Mitomycin, Mitoxantrone HCl, Octreotide, Plicamycin, Procarbazine HCl, 
Streptozocin, Tamoxifen citrate, Thioguanine, Thiotepa, Vinblastine sulfate, Vincristine sulfate, 
Amsacrine, Azacitidine, Hexamethylmelamine, Interleukin-2, Mitoguazone, Pentostatin, 
Semustine, Teniposide, and Vindesine sulfate. 

In addition, therapeutic compositions of the invention may be used for prophylactic 
treatment of cancer. There are hereditary conditions and/or environmental situations (e.g. 
exposure to carcinogens) known in the art that predispose an individual to developing cancers. 
Under these circumstances, it may be beneficial to treat these individuals with therapeutically 
effective doses of the polypeptide of the invention to reduce the risk of developing cancers. 

In vitro models can be used to determine the effective doses of the polypeptide of the 
invention as a potential cancer treatment. These in vitro models include proliferation assays of 
cultured tumor cells, growth of cultured tumor cells in soft agar (see Freshney, (1987) Culture of 
Animal Cells: A Manual of Basic Technique, Wiley-Liss, New York, NY Ch 18 and Ch 21), 
tumor systems in nude mice as described in Giovanella et al., J. Natl. Can. Inst., 52: 921-30 
(1974), mobility and invasive potential of tumor cells in Boyden Chamber assays as described in 
Pilkington et al., Anticancer Res., 17: 4107-9 (1997), and angiogenesis assays such as induction 
of vascularization of the chick chorioallantoic membrane or induction of vascular endothelial 
cell migration as described in Ribatti et al., Intl. J. Dev. Biol., 40: 1189-97 (1999) and Li et al., 
Clin. Exp. Metastasis, 17:423-9 (1999), respectively. Suitable tumor cell lines are available, 
e.g. from American Type Culture Collection catalogs. 

4.10.12 RECEPTOR/LIGAND ACTIVITY 

A polypeptide of the present invention may also demonstrate activity as a receptor, 
receptor ligand or inhibitor or agonist of receptor/ligand interactions. A polynucleotide of the 
invention can encode a polypeptide exhibiting such characteristics. Examples of such receptors 
and ligands include, without limitation, cytokine receptors and their ligands, receptor kinases and 
their ligands, receptor phosphatases and their ligands, receptors involved in cell-cell interactions 
and their ligands (including without limitation, cellular adhesion molecules (such as selectins, 
integrins and their ligands) and receptor/ligand pairs involved in antigen presentation, antigen 
recognition and development of cellular and humoral immune responses). Receptors and ligands 
are also useful for screening of potential peptide or small molecule inhibitors of the relevant 
receptor/ligand interaction. A protein of the present invention (including, without limitation, 
fragments of receptors and ligands) may itself be useful as an inhibitor of receptor/ligand 
interactions. 

The activity of a polypeptide of the invention may, among other means, be measured by 
the following methods: 

Suitable assays for receptor-ligand activity include without limitation those described in: 
Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. 
Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 7.28, 
Measurement of Cellular Adhesion under static conditions 7.28.1-7.28.22), Takai et al., Proc. 
Natl. Acad. Sci. USA 84:6864-6868, 1987; Bierer et al., J. Exp. Med. 168:1145-1156, 1988; 
Rosenstein et al., J. Exp. Med. 169:149-160, 1989; Stoltenborg et al., J. Immunol. Methods 
175:59-68, 1994; Stitt et al., Cell 80:661-670, 1995. 

By way of example, the polypeptides of the invention may be used as a receptor for a 
ligand(s) thereby transmitting the biological activity of that ligand(s). Ligands may be identified 
through binding assays, affinity chromatography, dihybrid screening assays, BIAcore assays, gel 
overlay assays, or other methods known in the art. 

Studies characterizing drugs or proteins as agonists, antagonists, partial agonists or 
partial antagonists require the use of other proteins as competing ligands. The polypeptides of the 
present invention or ligand(s) thereof may be labeled by being coupled to radioisotopes, 
colorimetric molecules or toxin molecules by conventional methods ("Guide to Protein 
Purification" Murray P. Deutscher (ed) Methods in Enzymology Vol. 182 (1990) Academic 
Press, Inc. San Diego). Examples of radioisotopes include, but are not limited to, tritium and 
carbon-14. Examples of colorimetric molecules include, but are not limited to, fluorescent 
molecules such as fluorescamine, or rhodamine or other colorimetric molecules. Examples of 
toxins include, but are not limited to, ricin. 

4.10.13 DRUG SCREENING 

This invention is particularly useful for screening chemical compounds by using the 
novel polypeptides or binding fragments thereof in any of a variety of drug screening techniques. 
The polypeptides or fragments employed in such a test may either be free in solution, affixed to a 
solid support, borne on a cell surface or located intracellularly. One method of drug screening 
utilizes eukaryotic or prokaryotic host cells which are stably transformed with recombinant 
nucleic acids expressing the polypeptide or a fragment thereof. Drugs are screened against such 
transformed cells in competitive binding assays. Such cells, either in viable or fixed form, can 
be used for standard binding assays. One may measure, for example, the formation of complexes 
between polypeptides of the invention or fragments and the agent being tested or examine the 
diminution in complex formation between the novel polypeptides and an appropriate cell line, 
which are well known in the art. 

Sources for test compounds that may be screened for ability to bind to or modulate (i.e., 
increase or decrease) the activity of polypeptides of the invention include (1) inorganic and 
organic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries 
comprised of either random or mimetic peptides, oligonucleotides or organic molecules. 

Chemical libraries may be readily synthesized or purchased from a number of 
commercial sources, and may include structural analogs of known compounds or compounds 
that are identified as "hits" or "leads" via natural product screening. 

The sources of natural product libraries are microorganisms (including bacteria and 
fungi), animals, plants or other vegetation, or marine organisms, and libraries of mixtures for 
screening may be created by: (1) fermentation and extraction of broths from soil, plant or marine 
microorganisms or (2) extraction of the organisms themselves. Natural product libraries include 
polyketides, non-ribosomal peptides, and (non-naturally occurring) variants thereof. For a 
review, see Science 282:63-68 (1998). 

Combinatorial libraries are composed of large numbers of peptides, oligonucleotides or 
organic compounds and can be readily prepared by traditional automated synthesis methods, 
PCR, cloning or proprietary synthetic methods. Of particular interest are peptide and 
oligonucleotide combinatorial libraries. Still other libraries of interest include peptide, protein, 
peptidomimetic, multiparallel synthetic collection, recombinatorial, and polypeptide libraries. 
For a review of combinatorial chemistry and libraries created therefrom, see Myers, Curr. Opin. 
Biotechnol. 8:701-707 (1997). For reviews and examples of peptidomimetic libraries, see 
Al-Obeidi et al., Mol. Biotechnol. 9(3):205-23 (1998); Hruby et al., Curr. Opin. Chem. Biol., 
1(1):114-19 (1997); Dorner et al., Bioorg. Med. Chem., 4(5):709-15 (1996) (alkylated dipeptides). 

Identification of modulators through use of the various libraries described herein permits 
modification of the candidate "hit" (or "lead") to optimize the capacity of the "hit" to bind a 
polypeptide of the invention. The molecules identified in the binding assay are then tested for 
antagonist or agonist activity in in vivo tissue culture or animal models that are well known in the 
art. In brief, the molecules are titrated into a plurality of cell cultures or animals and then tested 
for either cell/animal death or prolonged survival of the animal/cells. 

The binding molecules thus identified may be complexed with toxins, e.g., ricin or 
cholera toxin, or with other compounds that are toxic to cells such as radioisotopes. The toxin-binding 
molecule complex is then targeted to a tumor or other cell by the specificity of the binding 
molecule for a polypeptide of the invention. Alternatively, the binding molecules may be 
complexed with imaging agents for targeting and imaging purposes. 

4.10.14 ASSAY FOR RECEPTOR ACTIVITY 

The invention also provides methods to detect specific binding of a polypeptide, e.g., a 
ligand or a receptor. The art provides numerous assays particularly useful for identifying 
previously unknown binding partners for receptor polypeptides of the invention. For example, 
expression cloning using mammalian or bacterial cells, or dihybrid screening assays can be used 
to identify polynucleotides encoding binding partners. As another example, affinity 
chromatography with the appropriate immobilized polypeptide of the invention can be used to 
isolate polypeptides that recognize and bind polypeptides of the invention. There are a number 
of different libraries used for the identification of compounds, and in particular small molecules, 
that modulate (i.e., increase or decrease) biological activity of a polypeptide of the invention. 
Ligands for receptor polypeptides of the invention can also be identified by adding exogenous 
ligands, or cocktails of ligands, to two cell populations that are genetically identical except for 
the expression of the receptor of the invention: one cell population expresses the receptor of the 
invention whereas the other does not. The responses of the two cell populations to the addition of 
ligand(s) are then compared. Alternatively, an expression library can be co-expressed with the 
polypeptide of the invention in cells and assayed for an autocrine response to identify potential 
ligand(s). As still another example, BIAcore assays, gel overlay assays, or other methods known 
in the art can be used to identify binding partner polypeptides, including (1) organic and 
inorganic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries 
comprised of random peptides, oligonucleotides or organic molecules. 
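
By way of illustration only, the comparison of two otherwise identical cell populations described above might be tabulated as in the following Python sketch. The function name, the fold-change threshold, the ligand labels and the readout values are all hypothetical and are not part of the disclosure; any suitable response measurement and statistical test could be substituted.

```python
from statistics import mean


def candidate_ligands(responses, min_fold=2.0):
    """Flag ligands whose mean response in receptor-expressing cells exceeds
    the mean response in matched non-expressing cells by at least min_fold.

    responses maps a ligand name to a pair of replicate lists:
    (readouts from expressing cells, readouts from non-expressing cells).
    """
    hits = []
    for ligand, (expressing, non_expressing) in responses.items():
        baseline = mean(non_expressing)
        signal = mean(expressing)
        # Skip ligands with no measurable baseline to avoid dividing by zero.
        if baseline > 0 and signal / baseline >= min_fold:
            hits.append((ligand, round(signal / baseline, 2)))
    # Strongest receptor-dependent responses first.
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical readouts in arbitrary units, three replicate wells each.
responses = {
    "ligand_A": ([9.8, 10.2, 10.0], [1.9, 2.1, 2.0]),
    "ligand_B": ([2.1, 1.9, 2.0], [2.0, 2.2, 1.8]),
    "ligand_C": ([6.1, 5.9, 6.0], [2.0, 2.0, 2.0]),
}
print(candidate_ligands(responses))
```

A ligand that elicits a similar response in both populations (ligand_B in this made-up data) is not flagged, since its effect does not depend on expression of the receptor.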

The role of downstream intracellular signaling molecules in the signaling cascade of the 
polypeptide of the invention can be determined. For example, a chimeric protein in which the 
cytoplasmic domain of the polypeptide of the invention is fused to the extracellular portion of a 
protein, whose ligand has been identified, is produced in a host cell. The cell is then incubated 
with the ligand specific for the extracellular portion of the chimeric protein, thereby activating 
the chimeric receptor. Known downstream proteins involved in intracellular signaling can then 
be assayed for expected modifications, e.g., phosphorylation. Other methods known to those in the 
art can also be used to identify signaling molecules involved in receptor activity. 

4.10.15 ANTI-INFLAMMATORY ACTIVITY 

Compositions of the present invention may also exhibit anti-inflammatory activity. The 
anti-inflammatory activity may be achieved by providing a stimulus to cells involved in the 
inflammatory response, by inhibiting or promoting cell-cell interactions (such as, for example, 
cell adhesion), by inhibiting or promoting chemotaxis of cells involved in the inflammatory 
process, inhibiting or promoting cell extravasation, or by stimulating or suppressing production 
of other factors which more directly inhibit or promote an inflammatory response. Compositions 
with such activities can be used to treat inflammatory conditions (including chronic or acute 
conditions), including without limitation inflammation associated with infection (such as septic 
shock, sepsis or systemic inflammatory response syndrome (SIRS)), ischemia-reperfusion injury, 
endotoxin lethality, arthritis, complement-mediated hyperacute rejection, nephritis, cytokine or 
chemokine-induced lung injury, inflammatory bowel disease, Crohn's disease or inflammation resulting from 
overproduction of cytokines such as TNF or IL-1. Compositions of the invention may also be 
useful to treat anaphylaxis and hypersensitivity to an antigenic substance or material. 
Compositions of this invention may be utilized to prevent or treat conditions such as, but not 
limited to, sepsis, acute pancreatitis, endotoxin shock, cytokine induced shock, rheumatoid 
arthritis, chronic inflammatory arthritis, pancreatic cell damage from diabetes mellitus type 1, 
graft versus host disease, inflammatory bowel disease, inflammation associated with pulmonary 
disease, other autoimmune disease or inflammatory disease, or as an antiproliferative agent such as for 
acute or chronic myelogenous leukemia or in the prevention of premature labor secondary to 
intrauterine infections. 

4.10.16 LEUKEMIAS 

Leukemias and related disorders may be treated or prevented by administration of a 
therapeutic that promotes or inhibits function of the polynucleotides and/or polypeptides of the 
invention. Such leukemias and related disorders include but are not limited to acute leukemia, 
acute lymphocytic leukemia, acute myelocytic leukemia, myeloblastic, promyelocytic, 
myelomonocytic, monocytic, erythroleukemia, chronic leukemia, chronic myelocytic 
(granulocytic) leukemia and chronic lymphocytic leukemia (for a review of such disorders, see 
Fishman et al., 1985, Medicine, 2d Ed., J.B. Lippincott Co., Philadelphia). 

4.10.17 NERVOUS SYSTEM DISORDERS 

Nervous system disorders, involving cell types which can be tested for efficacy of 
intervention with compounds that modulate the activity of the polynucleotides and/or 
polypeptides of the invention, and which can be treated upon thus observing an indication of 
therapeutic utility, include but are not limited to nervous system injuries, and diseases or 
disorders which result in either a disconnection of axons, a diminution or degeneration of 
neurons, or demyelination. Nervous system lesions which may be treated in a patient (including 
human and non-human mammalian patients) according to the invention include but are not 
limited to the following lesions of either the central (including spinal cord, brain) or peripheral 
nervous systems: 

(i) traumatic lesions, including lesions caused by physical injury or associated with 
surgery, for example, lesions which sever a portion of the nervous system, or compression 
injuries; 

(ii) ischemic lesions, in which a lack of oxygen in a portion of the nervous system 
results in neuronal injury or death, including cerebral infarction or ischemia, or spinal cord 
infarction or ischemia; 

(iii) infectious lesions, in which a portion of the nervous system is destroyed or injured 
as a result of infection, for example, by an abscess or associated with infection by human 
immunodeficiency virus, herpes zoster, or herpes simplex virus or with Lyme disease, 
tuberculosis, or syphilis; 

(iv) degenerative lesions, in which a portion of the nervous system is destroyed or 
injured as a result of a degenerative process including but not limited to degeneration associated 
with Parkinson's disease, Alzheimer's disease, Huntington's chorea, or amyotrophic lateral 
sclerosis; 

(v) lesions associated with nutritional diseases or disorders, in which a portion of the 
nervous system is destroyed or injured by a nutritional disorder or disorder of metabolism 
including but not limited to, vitamin B12 deficiency, folic acid deficiency, Wernicke disease, 
tobacco-alcohol amblyopia, Marchiafava-Bignami disease (primary degeneration of the corpus 
callosum), and alcoholic cerebellar degeneration; 

(vi) neurological lesions associated with systemic diseases including but not limited to 
diabetes (diabetic neuropathy, Bell's palsy), systemic lupus erythematosus, carcinoma, or 
sarcoidosis; 

(vii) lesions caused by toxic substances including alcohol, lead, or particular 
neurotoxins; and 

(viii) demyelinated lesions in which a portion of the nervous system is destroyed or 
injured by a demyelinating disease including but not limited to multiple sclerosis, human 
immunodeficiency virus-associated myelopathy, transverse myelopathy of various etiologies, 
progressive multifocal leukoencephalopathy, and central pontine myelinolysis. 

Therapeutics which are useful according to the invention for treatment of a nervous 
system disorder may be selected by testing for biological activity in promoting the survival or 
differentiation of neurons. For example, and not by way of limitation, therapeutics which elicit 
any of the following effects may be useful according to the invention: 

(i) increased survival time of neurons in culture; 

(ii) increased sprouting of neurons in culture or in vivo; 

(iii) increased production of a neuron-associated molecule in culture or in vivo, e.g., 
choline acetyltransferase or acetylcholinesterase with respect to motor neurons; or 

(iv) decreased symptoms of neuron dysfunction in vivo. 

Such effects may be measured by any method known in the art. In preferred, 
non-limiting embodiments, increased survival of neurons may be measured by the method set 
forth in Arakawa et al. (1990, J. Neurosci. 10:3507-3515); increased sprouting of neurons may 
be detected by methods set forth in Pestronk et al. (1980, Exp. Neurol. 70:65-82) or Brown et al. 
(1981, Ann. Rev. Neurosci. 4:17-42); increased production of neuron-associated molecules may 
be measured by bioassay, enzymatic assay, antibody binding, Northern blot assay, etc., 
depending on the molecule to be measured; and motor neuron dysfunction may be measured by 
assessing the physical manifestation of motor neuron disorder, e.g., weakness, motor neuron 
conduction velocity, or functional disability. 

In specific embodiments, motor neuron disorders that may be treated according to the 
invention include but are not limited to disorders such as infarction, infection, exposure to toxin, 
trauma, surgical damage, degenerative disease or malignancy that may affect motor neurons as 
well as other components of the nervous system, as well as disorders that selectively affect 
neurons such as amyotrophic lateral sclerosis, and including but not limited to progressive spinal 
muscular atrophy, progressive bulbar palsy, primary lateral sclerosis, infantile and juvenile 
muscular atrophy, progressive bulbar paralysis of childhood (Fazio-Londe syndrome), 
poliomyelitis and the post polio syndrome, and Hereditary Motorsensory Neuropathy 
(Charcot-Marie-Tooth Disease). 



4.10.18 OTHER ACTIVITIES 

A polypeptide of the invention may also exhibit one or more of the following additional 
activities or effects: inhibiting the growth, infection or function of, or killing, infectious agents, 
including, without limitation, bacteria, viruses, fungi and other parasites; affecting (suppressing 
or enhancing) bodily characteristics, including, without limitation, height, weight, hair color, eye 
color, skin, fat to lean ratio or other tissue pigmentation, or organ or body part size or shape 
(such as, for example, breast augmentation or diminution, change in bone form or shape); 
affecting biorhythms or circadian cycles or rhythms; affecting the fertility of male or female 
subjects; affecting the metabolism, catabolism, anabolism, processing, utilization, storage or 
elimination of dietary fat, lipid, protein, carbohydrate, vitamins, minerals, co-factors or other 
nutritional factors or component(s); affecting behavioral characteristics, including, without 
limitation, appetite, libido, stress, cognition (including cognitive disorders), depression 
(including depressive disorders) and violent behaviors; providing analgesic effects or other pain 
reducing effects; promoting differentiation and growth of embryonic stem cells in lineages other 
than hematopoietic lineages; hormonal or endocrine activity; in the case of enzymes, correcting 
deficiencies of the enzyme and treating deficiency-related diseases; treatment of 
hyperproliferative disorders (such as, for example, psoriasis); immunoglobulin-like activity (such 
as, for example, the ability to bind antigens or complement); and the ability to act as an antigen 
in a vaccine composition to raise an immune response against such protein or another material or 
entity which is cross-reactive with such protein. 

4.10.19 IDENTIFICATION OF POLYMORPHISMS 

The demonstration of polymorphisms makes possible the identification of such 
polymorphisms in human subjects and the pharmacogenetic use of this information for diagnosis 
and treatment. Such polymorphisms may be associated with, e.g., differential predisposition or 
susceptibility to various disease states (such as disorders involving inflammation or immune 
response) or a differential response to drug administration, and this genetic information can be 
used to tailor preventive or therapeutic treatment appropriately. For example, the existence of a 
polymorphism associated with a predisposition to inflammation or autoimmune disease makes 
possible the diagnosis of this condition in humans by identifying the presence of the 
polymorphism. 

Polymorphisms can be identified in a variety of ways known in the art which all 
generally involve obtaining a sample from a patient, analyzing DNA from the sample, optionally 
involving isolation or amplification of the DNA, and identifying the presence of the 
polymorphism in the DNA. For example, PCR may be used to amplify an appropriate fragment 
of genomic DNA which may then be sequenced. Alternatively, the DNA may be subjected to 
allele-specific oligonucleotide hybridization (in which appropriate oligonucleotides are 
hybridized to the DNA under conditions permitting detection of a single base mismatch) or to a 
single nucleotide extension assay (in which an oligonucleotide that hybridizes immediately 
adjacent to the position of the polymorphism is extended with one or more labeled nucleotides). 
In addition, traditional restriction fragment length polymorphism analysis (using restriction 
enzymes that provide differential digestion of the genomic DNA depending on the presence or 
absence of the polymorphism) may be performed. Arrays with nucleotide sequences of the 
present invention can be used to detect polymorphisms. The array can comprise modified 
nucleotide sequences of the present invention in order to detect the nucleotide sequences of the 
present invention. In the alternative, any one of the nucleotide sequences of the present 
invention can be placed on the array to detect changes from those sequences. 
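
As a purely illustrative aid, the Python sketch below shows in silico how a single base change can be located by comparing a sequenced fragment with a reference, and how the same change alters a restriction fragment length pattern. The sequences are invented toy examples (not sequences of the invention), the function names are hypothetical, and the cut is placed at the start of the recognition site for simplicity, whereas a real enzyme cuts at a defined offset within its site.

```python
def mismatch_positions(reference, sample):
    """Return 0-based positions at which two equal-length sequences differ."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [i for i, (r, s) in enumerate(zip(reference, sample)) if r != s]


def fragment_lengths(sequence, site):
    """Return fragment lengths after cutting a linear sequence at every
    occurrence of the recognition site (simplified cut position)."""
    fragments = []
    start = 0
    pos = sequence.find(site)
    while pos != -1:
        if pos > start:
            fragments.append(pos - start)
        start = pos
        pos = sequence.find(site, pos + 1)
    fragments.append(len(sequence) - start)
    return fragments


SITE = "GAATTC"  # EcoRI recognition sequence
reference = "AAAA" + "GAATTC" + "CCCC" + "GAATTC" + "TTTT"
variant = "AAAA" + "GAATTC" + "CCCC" + "GAGTTC" + "TTTT"  # one site destroyed

print(mismatch_positions(reference, variant))
print(fragment_lengths(reference, SITE))
print(fragment_lengths(variant, SITE))
```

In this toy case the variant loses one recognition site, so digestion yields two fragments instead of three, which is the kind of differential digestion on which restriction fragment length polymorphism analysis relies.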

Alternatively, a polymorphism resulting in a change in the amino acid sequence could 
also be detected by detecting a corresponding change in amino acid sequence of the protein, e.g., 
by an antibody specific to the variant sequence. 



4.10.20 ARTHRITIS AND INFLAMMATION 

The immunosuppressive effects of the compositions of the invention against rheumatoid 
arthritis are determined in an experimental animal model system. The experimental model system 
is adjuvant induced arthritis in rats, and the protocol is described by J. Holoshitz, et al., 1983, 
Science, 219:56, or by B. Waksman et al., 1963, Int. Arch. Allergy Appl. Immunol., 23:129. 
Induction of the disease can be caused by a single injection, generally intradermally, of a 
suspension of killed Mycobacterium tuberculosis in complete Freund's adjuvant (CFA). The 
route of injection can vary, but rats may be injected at the base of the tail with an adjuvant 
mixture. The polypeptide is administered in phosphate buffered saline (PBS) at a dose of about 
1-5 mg/kg. The control consists of administering PBS only. 

The procedure for testing the effects of the test compound would consist of intradermally 
injecting killed Mycobacterium tuberculosis in CFA followed by immediately administering the 
test compound and subsequent treatment every other day until day 24. At 14, 15, 18, 20, 22, and 
24 days after injection of Mycobacterium CFA, an overall arthritis score may be obtained as 
described by J. Holoshitz above. An analysis of the data would reveal that the test compound 
would have a dramatic effect on the swelling of the joints as measured by a decrease of the 
arthritis score. 
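
The timing of the protocol above can be laid out as in the following Python sketch, offered only as an illustration. It assumes that day 0 is the day of adjuvant injection and that "every other day" means days 0, 2, 4 and so on through day 24; the function names and the percent-reduction summary are hypothetical conveniences, not part of the cited scoring method.

```python
from statistics import mean

# Days after adjuvant injection on which arthritis scores are recorded,
# as listed in the text.
SCORING_DAYS = [14, 15, 18, 20, 22, 24]


def treatment_days(last_day=24, interval=2):
    """Day 0 (immediately after injection), then every other day (assumed)."""
    return list(range(0, last_day + 1, interval))


def percent_reduction(control_scores, treated_scores):
    """Percent decrease of the mean arthritis score relative to PBS control."""
    control_mean = mean(control_scores)
    if control_mean == 0:
        raise ValueError("control score is zero; reduction is undefined")
    return 100.0 * (control_mean - mean(treated_scores)) / control_mean


print(treatment_days())
print(SCORING_DAYS)
```

Scores from PBS-only control animals and from treated animals on each scoring day could then be passed to percent_reduction to summarize any decrease in the arthritis score.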

4.11 THERAPEUTIC METHODS 

The compositions (including polypeptide fragments, analogs, variants and antibodies or 
other binding partners or modulators including antisense polynucleotides) of the invention have 
numerous applications in a variety of therapeutic methods. Examples of therapeutic applications 
include, but are not limited to, those exemplified herein. 



4.11.1 EXAMPLE 

One embodiment of the invention is the administration of an effective amount of the 
polypeptides or other composition of the invention to individuals affected by a disease or 
disorder that can be modulated by regulating the peptides of the invention. While the mode of 
administration is not particularly important, parenteral administration is preferred. An 
exemplary mode of administration is to deliver an intravenous bolus. The dosage of the 
polypeptides or other composition of the invention will normally be determined by the 
prescribing physician. It is to be expected that the dosage will vary according to the age, weight, 
condition and response of the individual patient. Typically, the amount of polypeptide 
administered per dose will be in the range of about 0.01 µg/kg to 100 mg/kg of body weight, with 
the preferred dose being about 0.1 µg/kg to 10 mg/kg of patient body weight. For parenteral 
administration, polypeptides of the invention will be formulated in an injectable form combined 
with a pharmaceutically acceptable parenteral vehicle. Such vehicles are well known in the art 
and examples include water, saline, Ringer's solution, dextrose solution, and solutions consisting 
of small amounts of human serum albumin. The vehicle may contain minor amounts of 
additives that maintain the isotonicity and stability of the polypeptide or other active ingredient. 
The preparation of such solutions is within the skill of the art. 
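
For illustration only, the weight-based ranges above translate into per-dose amounts as in the following Python sketch. It reads the stated ranges as about 0.01 µg/kg to 100 mg/kg (general) and about 0.1 µg/kg to 10 mg/kg (preferred); the function name and the 70 kg example weight are hypothetical, and the actual dosage remains a matter for the prescribing physician.

```python
UG_PER_MG = 1000.0

# (lower bound in micrograms per kg, upper bound in milligrams per kg)
GENERAL_RANGE = (0.01, 100.0)
PREFERRED_RANGE = (0.1, 10.0)


def dose_bounds_mg(body_weight_kg, dose_range):
    """Return (lowest, highest) amount per dose in mg for a given body weight."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low_ug_per_kg, high_mg_per_kg = dose_range
    low_mg = low_ug_per_kg * body_weight_kg / UG_PER_MG  # convert ug to mg
    high_mg = high_mg_per_kg * body_weight_kg
    return low_mg, high_mg


# Hypothetical 70 kg patient.
low, high = dose_bounds_mg(70.0, PREFERRED_RANGE)
print(f"preferred range: {low * UG_PER_MG:.1f} ug to {high:.0f} mg per dose")
```

The span of several orders of magnitude between the lower and upper bounds reflects the breadth of the stated range rather than any recommended dose.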

4.12 PHARMACEUTICAL FORMULATIONS AND ROUTES OF 
ADMINISTRATION 

A protein or other composition of the present invention (from whatever source derived, 
including without limitation from recombinant and non-recombinant sources and including 
antibodies and other binding partners of the polypeptides of the invention) may be administered 
to a patient in need, by itself, or in pharmaceutical compositions where it is mixed with suitable 
carriers or excipient(s) at doses to treat or ameliorate a variety of disorders. Such a composition 
may optionally contain (in addition to protein or other active ingredient and a carrier) diluents, 
fillers, salts, buffers, stabilizers, solubilizers, and other materials well known in the art. The term 
"pharmaceutically acceptable" means a non-toxic material that does not interfere with the 
effectiveness of the biological activity of the active ingredient(s). The characteristics of the 
carrier will depend on the route of administration. The pharmaceutical composition of the 
invention may also contain cytokines, lymphokines, or other hematopoietic factors such as 
M-CSF, GM-CSF, TNF, IL-1, IL-2, IL-3, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, IL-10, IL-11, IL-12,
IL-13, IL-14, IL-15, IFN, TNF0, TNF1, TNF2, G-CSF, Meg-CSF, thrombopoietin, stem cell
factor, and erythropoietin. In further compositions, proteins of the invention may be combined 
with other agents beneficial to the treatment of the disease or disorder in question. These agents 
include various growth factors such as epidermal growth factor (EGF), platelet-derived growth
factor (PDGF), transforming growth factors (TGF-α and TGF-β), insulin-like growth factor
(IGF), as well as cytokines described herein. 

The pharmaceutical composition may further contain other agents which either enhance
the activity of the protein or other active ingredient or complement its activity or use in
treatment. Such additional factors and/or agents may be included in the pharmaceutical 
composition to produce a synergistic effect with protein or other active ingredient of the 
invention, or to minimize side effects. Conversely, protein or other active ingredient of the
present invention may be included in formulations of the particular clotting factor, cytokine, 
lymphokine, other hematopoietic factor, thrombolytic or anti-thrombotic factor, or
anti-inflammatory agent to minimize side effects of the clotting factor, cytokine, lymphokine, other
hematopoietic factor, thrombolytic or anti-thrombotic factor, or anti-inflammatory agent (such as
IL-1Ra, IL-1 Hy1, IL-1 Hy2, anti-TNF, corticosteroids, immunosuppressive agents). A protein
of the present invention may be active in multimers (e.g., heterodimers or homodimers) or 
complexes with itself or other proteins. As a result, pharmaceutical compositions of the 
invention may comprise a protein of the invention in such multimeric or complexed form. 

As an alternative to being included in a pharmaceutical composition of the invention
including a first protein, a second protein or a therapeutic agent may be concurrently
administered with the first protein (e.g., at the same time, or at differing times provided that
therapeutic concentrations of the combination of agents are achieved at the treatment site).
Techniques for formulation and administration of the compounds of the instant application may 
be found in "Remington's Pharmaceutical Sciences," Mack Publishing Co., Easton, PA, latest
edition. A therapeutically effective dose further refers to that amount of the compound sufficient
to result in amelioration of symptoms, e.g., treatment, healing, prevention or amelioration of the 
relevant medical condition, or an increase in rate of treatment, healing, prevention or 
amelioration of such conditions. When applied to an individual active ingredient, administered 
alone, a therapeutically effective dose refers to that ingredient alone. When applied to a
combination, a therapeutically effective dose refers to combined amounts of the active
ingredients that result in the therapeutic effect, whether administered in combination, serially or
simultaneously. 

In practicing the method of treatment or use of the present invention, a therapeutically 
effective amount of protein or other active ingredient of the present invention is administered to
a mammal having a condition to be treated. Protein or other active ingredient of the present
invention may be administered in accordance with the method of the invention either alone or in
combination with other therapies such as treatments employing cytokines, lymphokines or other
hematopoietic factors. When co-administered with one or more cytokines, lymphokines or other
hematopoietic factors, protein or other active ingredient of the present invention may be
administered either simultaneously with the cytokine(s), lymphokine(s), other hematopoietic
factor(s), thrombolytic or anti-thrombotic factors, or sequentially. If administered sequentially,
the attending physician will decide on the appropriate sequence of administering protein or other 
active ingredient of the present invention in combination with cytokine(s), lymphokine(s), other 
hematopoietic factor(s), thrombolytic or anti-thrombotic factors. 

4.12.1 ROUTES OF ADMINISTRATION 

Suitable routes of administration may, for example, include oral, rectal, transmucosal, or 
intestinal administration; parenteral delivery, including intramuscular, subcutaneous, 
intramedullary injections, as well as intrathecal, direct intraventricular, intravenous,
intraperitoneal, intranasal, or intraocular injections. Administration of protein or other active
ingredient of the present invention used in the pharmaceutical composition or to practice the 
method of the present invention can be carried out in a variety of conventional ways, such as oral 
ingestion, inhalation, topical application or cutaneous, subcutaneous, intraperitoneal, parenteral 
or intravenous injection. Intravenous administration to the patient is preferred. 

Alternately, one may administer the compound in a local rather than systemic manner, for
example, via injection of the compound directly into an arthritic joint or into fibrotic tissue, often in
a depot or sustained release formulation. In order to prevent the scarring process frequently
occurring as a complication of glaucoma surgery, the compounds may be administered topically,
for example, as eye drops. Furthermore, one may administer the drug in a targeted drug delivery
system, for example, in a liposome coated with a specific antibody, targeting, for example,
arthritic or fibrotic tissue. The liposomes will be targeted to and taken up selectively by the 
afflicted tissue. 

The polypeptides of the invention are administered by any route that delivers an effective 
dosage to the desired site of action. The determination of a suitable route of administration and 
an effective dosage for a particular indication is within the level of skill in the art. Preferably for
wound treatment, one administers the therapeutic compound directly to the site. Suitable dosage
ranges for the polypeptides of the invention can be extrapolated from these dosages or from
similar studies in appropriate animal models. Dosages can then be adjusted as necessary by the
clinician to provide maximal therapeutic benefit.

4.12.2 COMPOSITIONS/FORMULATIONS 

Pharmaceutical compositions for use in accordance with the present invention thus may 
be formulated in a conventional manner using one or more physiologically acceptable carriers 
comprising excipients and auxiliaries which facilitate processing of the active compounds into 
preparations which can be used pharmaceutically. These pharmaceutical compositions may be
manufactured in a manner that is itself known, e.g., by means of conventional mixing,
dissolving, granulating, dragee-making, levigating, emulsifying, encapsulating, entrapping or 
lyophilizing processes. Proper formulation is dependent upon the route of administration chosen. 
When a therapeutically effective amount of protein or other active ingredient of the present 
invention is administered orally, protein or other active ingredient of the present invention will
be in the form of a tablet, capsule, powder, solution or elixir. When administered in tablet form, 
the pharmaceutical composition of the invention may additionally contain a solid carrier such as 
a gelatin or an adjuvant. The tablet, capsule, and powder contain from about 5 to 95% protein or 
other active ingredient of the present invention, and preferably from about 25 to 90% protein or
other active ingredient of the present invention. When administered in liquid form, a liquid
carrier such as water, petroleum, oils of animal or plant origin such as peanut oil, mineral oil, 
soybean oil, or sesame oil, or synthetic oils may be added. The liquid form of the 
pharmaceutical composition may further contain physiological saline solution, dextrose or other 
saccharide solution, or glycols such as ethylene glycol, propylene glycol or polyethylene glycol.
When administered in liquid form, the pharmaceutical composition contains from about 0.5 to
90% by weight of protein or other active ingredient of the present invention, and preferably from 
about 1 to 50% protein or other active ingredient of the present invention. 

When a therapeutically effective amount of protein or other active ingredient of the 
present invention is administered by intravenous, cutaneous or subcutaneous injection, protein or
other active ingredient of the present invention will be in the form of a pyrogen-free, parenterally
acceptable aqueous solution. The preparation of such parenterally acceptable protein or other 
active ingredient solutions, having due regard to pH, isotonicity, stability, and the like, is within 
the skill in the art. A preferred pharmaceutical composition for intravenous, cutaneous, or 
subcutaneous injection should contain, in addition to protein or other active ingredient of the
present invention, an isotonic vehicle such as Sodium Chloride Injection, Ringer's Injection,
Dextrose Injection, Dextrose and Sodium Chloride Injection, Lactated Ringer's Injection, or 
other vehicle as known in the art. The pharmaceutical composition of the present invention may 
also contain stabilizers, preservatives, buffers, antioxidants, or other additives known to those of 
skill in the art. For injection, the agents of the invention may be formulated in aqueous solutions,
preferably in physiologically compatible buffers such as Hanks' solution, Ringer's solution, or
physiological saline buffer. For transmucosal administration, penetrants appropriate to the 
barrier to be permeated are used in the formulation. Such penetrants are generally known in the 
art. 

For oral administration, the compounds can be formulated readily by combining the 
active compounds with pharmaceutically acceptable carriers well known in the art. Such carriers
enable the compounds of the invention to be formulated as tablets, pills, dragees, capsules,
liquids, gels, syrups, slurries, suspensions and the like, for oral ingestion by a patient to be
treated. Pharmaceutical preparations for oral use can be obtained by combining the active
compounds with a solid excipient, optionally grinding the resulting mixture, and processing the
mixture of granules, after adding
suitable auxiliaries, if desired, to obtain tablets or dragee cores. Suitable excipients are, in 
particular, fillers such as sugars, including lactose, sucrose, mannitol, or sorbitol; cellulose 
preparations such as, for example, maize starch, wheat starch, rice starch, potato starch, gelatin, 
gum tragacanth, methyl cellulose, hydroxypropylmethyl-cellulose, sodium 
carboxymethylcellulose, and/or polyvinylpyrrolidone (PVP). If desired, disintegrating agents 
may be added, such as the cross-linked polyvinyl pyrrolidone, agar, or alginic acid or a salt 
thereof such as sodium alginate. Dragee cores are provided with suitable coatings. For this 
purpose, concentrated sugar solutions may be used, which may optionally contain gum arabic, 
talc, polyvinyl pyrrolidone, carbopol gel, polyethylene glycol, and/or titanium dioxide, lacquer 
solutions, and suitable organic solvents or solvent mixtures. Dyestuffs or pigments may be 
added to the tablets or dragee coatings for identification or to characterize different combinations 
of active compound doses. 

Pharmaceutical preparations which can be used orally include push-fit capsules made of 
gelatin, as well as soft, sealed capsules made of gelatin and a plasticizer, such as glycerol or 
sorbitol. The push-fit capsules can contain the active ingredients in admixture with filler such as 
lactose, binders such as starches, and/or lubricants such as talc or magnesium stearate and, 
optionally, stabilizers. In soft capsules, the active compounds may be dissolved or suspended in 
suitable liquids, such as fatty oils, liquid paraffin, or liquid polyethylene glycols. In addition, 
stabilizers may be added. All formulations for oral administration should be in dosages suitable 
for such administration. For buccal administration, the compositions may take the form of 
tablets or lozenges formulated in conventional manner. 

For administration by inhalation, the compounds for use according to the present 
invention are conveniently delivered in the form of an aerosol spray presentation from 
pressurized packs or a nebuliser, with the use of a suitable propellant, e.g., 
dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane, carbon dioxide or 
other suitable gas. In the case of a pressurized aerosol the dosage unit may be determined by 
providing a valve to deliver a metered amount. Capsules and cartridges of, e.g., gelatin for use in 
an inhaler or insufflator may be formulated containing a powder mix of the compound and a 
suitable powder base such as lactose or starch. The compounds may be formulated for parenteral 
administration by injection, e.g., by bolus injection or continuous infusion. Formulations for 
injection may be presented in unit dosage form, e.g., in ampules or in multi-dose containers, with 
an added preservative. The compositions may take such forms as suspensions, solutions or
emulsions in oily or aqueous vehicles, and may contain formulatory agents such as suspending, 
stabilizing and/or dispersing agents. 

Pharmaceutical formulations for parenteral administration include aqueous solutions of 
the active compounds in water-soluble form. Additionally, suspensions of the active compounds 
may be prepared as appropriate oily injection suspensions. Suitable lipophilic solvents or 
vehicles include fatty oils such as sesame oil, or synthetic fatty acid esters, such as ethyl oleate or 
triglycerides, or liposomes. Aqueous injection suspensions may contain substances which 
increase the viscosity of the suspension, such as sodium carboxymethyl cellulose, sorbitol, or 
dextran. Optionally, the suspension may also contain suitable stabilizers or agents which 
increase the solubility of the compounds to allow for the preparation of highly concentrated 
solutions. Alternatively, the active ingredient may be in powder form for constitution with a 
suitable vehicle, e.g., sterile pyrogen-free water, before use. 

The compounds may also be formulated in rectal compositions such as suppositories or 
retention enemas, e.g., containing conventional suppository bases such as cocoa butter or other 
glycerides. In addition to the formulations described previously, the compounds may also be 
formulated as a depot preparation. Such long acting formulations may be administered by 
implantation (for example subcutaneously or intramuscularly) or by intramuscular injection. 
Thus, for example, the compounds may be formulated with suitable polymeric or hydrophobic 
materials (for example as an emulsion in an acceptable oil) or ion exchange resins, or as 
sparingly soluble derivatives, for example, as a sparingly soluble salt. 

A pharmaceutical carrier for the hydrophobic compounds of the invention is a co-solvent 
system comprising benzyl alcohol, a nonpolar surfactant, a water-miscible organic polymer, and 
an aqueous phase. The co-solvent system may be the VPD co-solvent system. VPD is a solution 
of 3% w/v benzyl alcohol, 8% w/v of the nonpolar surfactant polysorbate 80, and 65% w/v 
polyethylene glycol 300, made up to volume in absolute ethanol. The VPD co-solvent system 
(VPD:5W) consists of VPD diluted 1:1 with a 5% dextrose in water solution. This co-solvent
system dissolves hydrophobic compounds well, and itself produces low toxicity upon systemic 
administration. Naturally, the proportions of a co-solvent system may be varied considerably 
without destroying its solubility and toxicity characteristics. Furthermore, the identity of the 
co-solvent components may be varied: for example, other low-toxicity nonpolar surfactants may
be used instead of polysorbate 80; the fraction size of polyethylene glycol may be varied; other 
biocompatible polymers may replace polyethylene glycol, e.g. polyvinyl pyrrolidone; and other 
sugars or polysaccharides may substitute for dextrose. Alternatively, other delivery systems for 
hydrophobic pharmaceutical compounds may be employed. Liposomes and emulsions are well 
known examples of delivery vehicles or carriers for hydrophobic drugs. Certain organic solvents
such as dimethylsulfoxide also may be employed, although usually at the cost of greater toxicity. 
Additionally, the compounds may be delivered using a sustained-release system, such as 
semipermeable matrices of solid hydrophobic polymers containing the therapeutic agent. 
Various types of sustained-release materials have been established and are well known by those
skilled in the art. Sustained-release capsules may, depending on their chemical nature, release the 
compounds for a few weeks up to over 100 days. Depending on the chemical nature and the 
biological stability of the therapeutic reagent, additional strategies for protein or other active 
ingredient stabilization may be employed. 
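
The VPD and VPD:5W proportions given above can be turned into a weighing sheet. The following is a minimal sketch, assuming that "% w/v" means grams per 100 mL of finished solution, that the 1:1 dilution is by volume, and that volumes are additive on mixing; the function names are hypothetical and are not part of the disclosure.

```python
# Percent w/v is taken to mean grams of solute per 100 mL of finished solution.
VPD_PERCENT_WV = {
    "benzyl alcohol": 3.0,
    "polysorbate 80": 8.0,
    "polyethylene glycol 300": 65.0,
}
D5W_DEXTROSE_PERCENT_WV = 5.0  # 5% dextrose in water


def vpd_masses_g(vpd_volume_ml):
    """Grams of each VPD solute for a given VPD volume (made up to volume in ethanol)."""
    if vpd_volume_ml <= 0:
        raise ValueError("volume must be positive")
    return {name: pct * vpd_volume_ml / 100.0 for name, pct in VPD_PERCENT_WV.items()}


def vpd_5w_recipe_g(final_volume_ml):
    """Grams of each solute in VPD:5W, assuming a 1:1 (v/v) mix with additive volumes."""
    half = final_volume_ml / 2.0
    recipe = vpd_masses_g(half)  # solutes contributed by the VPD half
    recipe["dextrose"] = D5W_DEXTROSE_PERCENT_WV * half / 100.0  # from the D5W half
    return recipe


if __name__ == "__main__":
    for name, grams in vpd_5w_recipe_g(100.0).items():
        print(f"{name}: {grams:.2f} g per 100 mL VPD:5W")
```

Under these assumptions the diluted system carries half the w/v concentration of each VPD solute, which is the starting point from which the text notes the proportions may be varied.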

The pharmaceutical compositions also may comprise suitable solid or gel phase carriers
or excipients. Examples of such carriers or excipients include but are not limited to calcium
carbonate, calcium phosphate, various sugars, starches, cellulose derivatives, gelatin, and 
polymers such as polyethylene glycols. Many of the active ingredients of the invention may be 
provided as salts with pharmaceutically compatible counter ions. Such pharmaceutically
acceptable base addition salts are those salts which retain the biological effectiveness and
properties of the free acids and which are obtained by reaction with inorganic or organic bases
such as sodium hydroxide, magnesium hydroxide, ammonia, trialkylamine, dialkylamine, 
monoalkylamine, dibasic amino acids, sodium acetate, potassium benzoate, triethanol amine and 
the like. 

The pharmaceutical composition of the invention may be in the form of a complex of the
protein(s) or other active ingredient(s) of the present invention along with protein or peptide
antigens. The protein and/or peptide antigen will deliver a stimulatory signal to both B and T 
lymphocytes. B lymphocytes will respond to antigen through their surface immunoglobulin 
receptor. T lymphocytes will respond to antigen through the T cell receptor (TCR) following
presentation of the antigen by MHC proteins. MHC and structurally related proteins including
those encoded by class I and class II MHC genes on host cells will serve to present the peptide 
antigen(s) to T lymphocytes. The antigen components could also be supplied as purified 
MHC-peptide complexes alone or with co-stimulatory molecules that can directly signal T cells. 
Alternatively, antibodies able to bind surface immunoglobulin and other molecules on B cells as
well as antibodies able to bind the TCR and other molecules on T cells can be combined with the
pharmaceutical composition of the invention. 

The pharmaceutical composition of the invention may be in the form of a liposome in 
which protein of the present invention is combined, in addition to other pharmaceutically 
acceptable carriers, with amphipathic agents such as lipids which exist in aggregated form as
micelles, insoluble monolayers, liquid crystals, or lamellar layers in aqueous solution. Suitable
lipids for liposomal formulation include, without limitation, monoglycerides, diglycerides,
sulfatides, lysolecithins, phospholipids, saponin, bile acids, and the like. Preparation of such 
liposomal formulations is within the level of skill in the art, as disclosed, for example, in U.S. 
Patent Nos. 4,235,871; 4,501,728; 4,837,028; and 4,737,323, all of which are incorporated 
herein by reference.

The amount of protein or other active ingredient of the present invention in the 
pharmaceutical composition of the present invention will depend upon the nature and severity of 
the condition being treated, and on the nature of prior treatments which the patient has 
undergone. Ultimately, the attending physician will decide the amount of protein or other active 
ingredient of the present invention with which to treat each individual patient. Initially, the
attending physician will administer low doses of protein or other active ingredient of the present
invention and observe the patient's response. Larger doses of protein or other active ingredient 
of the present invention may be administered until the optimal therapeutic effect is obtained for 
the patient, and at that point the dosage is not increased further. It is contemplated that the 
various pharmaceutical compositions used to practice the method of the present invention should
contain about 0.01 µg to about 100 mg (preferably about 0.1 µg to about 10 mg, more preferably
about 0.1 µg to about 1 mg) of protein or other active ingredient of the present invention per kg
body weight. For compositions of the present invention which are useful for bone, cartilage,
tendon or ligament regeneration, the therapeutic method includes administering the composition
topically, systemically, or locally as an implant or device. When administered, the therapeutic
composition for use in this invention is, of course, in a pyrogen-free, physiologically acceptable
form. Further, the composition may desirably be encapsulated or injected in a viscous form for 
delivery to the site of bone, cartilage or tissue damage. Topical administration may be suitable 
for wound healing and tissue repair. Therapeutically useful agents other than a protein or other 
active ingredient of the invention which may also optionally be included in the composition as 
described above, may alternatively or additionally, be administered simultaneously or 
sequentially with the composition in the methods of the invention. Preferably for bone and/or 
cartilage formation, the composition would include a matrix capable of delivering the 
protein-containing or other active ingredient-containing composition to the site of bone and/or 
cartilage damage, providing a structure for the developing bone and cartilage and optimally 
capable of being resorbed into the body. Such matrices may be formed of materials presently in 
use for other implanted medical applications. 

The choice of matrix material is based on biocompatibility, biodegradability, mechanical 
properties, cosmetic appearance and interface properties. The particular application of the 
compositions will define the appropriate formulation. Potential matrices for the compositions 
may be biodegradable and chemically defined calcium sulfate, tricalcium phosphate,
hydroxyapatite, polylactic acid, polyglycolic acid and polyanhydrides. Other potential materials 
are biodegradable and biologically well-defined, such as bone or dermal collagen. Further 
matrices are comprised of pure proteins or extracellular matrix components. Other potential 
matrices are nonbiodegradable and chemically defined, such as sintered hydroxyapatite, bioglass,
aluminates, or other ceramics. Matrices may be comprised of combinations of any of the above 
mentioned types of material, such as polylactic acid and hydroxyapatite or collagen and 
tricalcium phosphate. The bioceramics may be altered in composition, such as in 
calcium-aluminate-phosphate and processing to alter pore size, particle size, particle shape, and 
biodegradability. Presently preferred is a 50:50 (mole weight) copolymer of lactic acid and
glycolic acid in the form of porous particles having diameters ranging from 150 to 800 microns.
In some applications, it will be useful to utilize a sequestering agent, such as carboxymethyl
cellulose or autologous blood clot, to prevent the protein compositions from disassociating from 
the matrix. 

A preferred family of sequestering agents is cellulosic materials such as alkylcelluloses
(including hydroxyalkylcelluloses), including methylcellulose, ethylcellulose,
hydroxyethylcellulose, hydroxypropylcellulose, hydroxypropyl-methylcellulose, and
carboxymethylcellulose, the most preferred being cationic salts of carboxymethylcellulose
(CMC). Other preferred sequestering agents include hyaluronic acid, sodium alginate,
poly(ethylene glycol), polyoxyethylene oxide, carboxyvinyl polymer and poly(vinyl alcohol).
The amount of sequestering agent useful herein is 0.5-20 wt %, preferably 1-10 wt % based on 
total formulation weight, which represents the amount necessary to prevent desorption of the 
protein from the polymer matrix and to provide appropriate handling of the composition, yet not 
so much that the progenitor cells are prevented from infiltrating the matrix, thereby providing the
protein the opportunity to assist the osteogenic activity of the progenitor cells. In further
compositions, proteins or other active ingredients of the invention may be combined with other
agents beneficial to the treatment of the bone and/or cartilage defect, wound, or tissue in
question. These agents include various growth factors such as epidermal growth factor (EGF),
platelet derived growth factor (PDGF), transforming growth factors (TGF-α and TGF-β), and
insulin-like growth factor (IGF).

The therapeutic compositions are also presently valuable for veterinary applications. 
Particularly domestic animals and thoroughbred horses, in addition to humans, are desired 
patients for such treatment with proteins or other active ingredients of the present invention. The 
dosage regimen of a protein-containing pharmaceutical composition to be used in tissue
regeneration will be determined by the attending physician considering various factors which
modify the action of the proteins, e.g., amount of tissue weight desired to be formed, the site of
damage, the condition of the damaged tissue, the size of a wound, type of damaged tissue (e.g., 
bone), the patient's age, sex, and diet, the severity of any infection, time of administration and 
other clinical factors. The dosage may vary with the type of matrix used in the reconstitution and 
with inclusion of other proteins in the pharmaceutical composition. For example, the addition of 
other known growth factors, such as IGF I (insulin like growth factor I), to the final composition,
may also affect the dosage. Progress can be monitored by periodic assessment of tissue/bone
growth and/or repair, for example, X-rays, histomorphometric determinations and tetracycline 
labeling. 

Polynucleotides of the present invention can also be used for gene therapy. Such 
polynucleotides can be introduced either in vivo or ex vivo into cells for expression in a 
mammalian subject. Polynucleotides of the invention may also be administered by other known 
methods for introduction of nucleic acid into a cell or organism (including, without limitation, in 
the form of viral vectors or naked DNA). Cells may also be cultured ex vivo in the presence of 
proteins of the present invention in order to proliferate or to produce a desired effect on or 
activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes. 

4.12.3 EFFECTIVE DOSAGE

Pharmaceutical compositions suitable for use in the present invention include 
compositions wherein the active ingredients are contained in an effective amount to achieve their
intended purpose. More specifically, a therapeutically effective amount means an amount 
effective to prevent development of or to alleviate the existing symptoms of the subject being 
treated. Determination of the effective amount is well within the capability of those skilled in 
the art, especially in light of the detailed disclosure provided herein. For any compound used in 
the method of the invention, the therapeutically effective dose can be estimated initially from 
appropriate in vitro assays. For example, a dose can be formulated in animal models to achieve a
circulating concentration range that includes the IC50 as determined in cell culture (i.e., the
concentration of the test compound which achieves a half-maximal inhibition of the protein's
biological activity). Such information can be used to more accurately determine useful doses in
humans.

A therapeutically effective dose refers to that amount of the compound that results in 
amelioration of symptoms or a prolongation of survival in a patient. Toxicity and therapeutic 
efficacy of such compounds can be determined by standard pharmaceutical procedures in cell 
cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% of the
population) and the ED50 (the dose therapeutically effective in 50% of the population). The dose
ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the
ratio between LD50 and ED50. Compounds which exhibit high therapeutic indices are preferred.
The data obtained from these cell culture assays and animal studies can be used in formulating a 
range of dosage for use in humans. The dosage of such compounds lies preferably within a range
of circulating concentrations that include the ED50 with little or no toxicity. The dosage may
vary within this range depending upon the dosage form employed and the route of administration 
utilized. The exact formulation, route of administration and dosage can be chosen by the 
individual physician in view of the patient's condition. See, e.g., Fingl et al., 1975, in "The 
Pharmacological Basis of Therapeutics", Ch. 1, p. 1. Dosage amount and interval may be adjusted
individually to provide plasma levels of the active moiety which are sufficient to maintain the 
desired effects, or minimal effective concentration (MEC). The MEC will vary for each 
compound but can be estimated from in vitro data. Dosages necessary to achieve the MEC will 
depend on individual characteristics and route of administration. However, HPLC assays or 
bioassays can be used to determine plasma concentrations. 
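
The therapeutic index described above is a simple ratio, and a short calculation makes the comparison between candidate compounds concrete. This is an illustrative sketch only: the function name and the numeric values are hypothetical and are not data from this disclosure.

```python
def therapeutic_index(ld50, ed50):
    """Return LD50 / ED50; both doses must be in the same units (e.g., mg/kg)."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical (LD50, ED50) pairs in mg/kg, for illustration only.
candidates = {
    "compound A": (500.0, 5.0),
    "compound B": (200.0, 20.0),
}

# Higher therapeutic index first, since high indices are preferred.
ranked = sorted(
    candidates,
    key=lambda name: therapeutic_index(*candidates[name]),
    reverse=True,
)

if __name__ == "__main__":
    for name in ranked:
        print(name, therapeutic_index(*candidates[name]))
```

A larger index means a wider margin between the effective dose and the lethal dose, which is why compounds with high indices are preferred.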

Dosage intervals can also be determined using the MEC value. Compounds should be
administered using a regimen which maintains plasma levels above the MEC for 10-90% of the
time, preferably between 30-90% and most preferably between 50-90%. In cases of local
administration or selective uptake, the effective local concentration of the drug may not be
related to plasma concentration. 
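
Whether a regimen keeps plasma levels above the MEC for the stated fraction of the time can be estimated from sampled plasma concentrations (for example, from HPLC assays). The sketch below assumes straight-line interpolation between consecutive samples, which is a simplification and not a pharmacokinetic model; the function names and sample values are hypothetical.

```python
def fraction_above_mec(times_h, concentrations, mec):
    """Fraction of the sampled interval with concentration >= mec.

    Assumes linear interpolation between samples; times must be strictly increasing.
    """
    if len(times_h) != len(concentrations) or len(times_h) < 2:
        raise ValueError("need at least two matching time/concentration samples")
    above = 0.0
    for i in range(len(times_h) - 1):
        t0, t1 = times_h[i], times_h[i + 1]
        c0, c1 = concentrations[i], concentrations[i + 1]
        dt = t1 - t0
        if dt <= 0:
            raise ValueError("times must be strictly increasing")
        if c0 >= mec and c1 >= mec:
            above += dt  # whole segment at or above the MEC
        elif c0 < mec and c1 < mec:
            continue  # whole segment below the MEC
        else:
            cross = (mec - c0) / (c1 - c0)  # relative position of the crossing
            above += dt * ((1.0 - cross) if c1 > c0 else cross)
    return above / (times_h[-1] - times_h[0])


def within_window(fraction, low=0.5, high=0.9):
    """Check a fraction against a target window (default: the most preferred 50-90%)."""
    return low <= fraction <= high


if __name__ == "__main__":
    # Hypothetical samples over one 8-hour dosing interval.
    f = fraction_above_mec([0.0, 4.0, 8.0], [10.0, 6.0, 2.0], mec=4.0)
    print(f, within_window(f))
```

The same check can be repeated with `low=0.1` or `low=0.3` to test against the broader ranges given in the text.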

An exemplary dosage regimen for polypeptides or other compositions of the invention 
will be in the range of about 0.01 µg/kg to 100 mg/kg of body weight daily, with the preferred
dose being about 0.1 µg/kg to 25 mg/kg of patient body weight daily, varying in adults and
children. Dosing may be once daily, or equivalent doses may be delivered at longer or shorter 
intervals. 
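
The per-kilogram ranges in the exemplary regimen scale directly with body weight. As a minimal sketch, assuming the bounds are read as about 0.01 µg/kg to 100 mg/kg (exemplary) and about 0.1 µg/kg to 25 mg/kg (preferred), the daily range for a given patient can be computed as follows; the function name is hypothetical, and the result is arithmetic only, not a dosing recommendation.

```python
# Bounds in micrograms per kg body weight per day (1 mg = 1000 µg).
EXEMPLARY_UG_PER_KG = (0.01, 100_000.0)  # about 0.01 µg/kg to 100 mg/kg
PREFERRED_UG_PER_KG = (0.1, 25_000.0)    # about 0.1 µg/kg to 25 mg/kg


def daily_dose_range_ug(weight_kg, preferred=False):
    """Return (low, high) daily dose in micrograms for the given body weight."""
    if weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low, high = PREFERRED_UG_PER_KG if preferred else EXEMPLARY_UG_PER_KG
    return low * weight_kg, high * weight_kg


if __name__ == "__main__":
    low, high = daily_dose_range_ug(70.0, preferred=True)
    print(f"preferred daily range for 70 kg: {low:g} µg to {high / 1000:g} mg")
```

The choice of a dose within the range remains with the prescribing physician, as the following paragraph states.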

The amount of composition administered will, of course, be dependent on the subject 
being treated, on the subject's age and weight, the severity of the affliction, the manner of 
administration and the judgment of the prescribing physician. 

4.12.4 PACKAGING 

The compositions may, if desired, be presented in a pack or dispenser device which may 
contain one or more unit dosage forms containing the active ingredient. The pack may, for 
example, comprise metal or plastic foil, such as a blister pack. The pack or dispenser device may 
be accompanied by instructions for administration. Compositions comprising a compound of the
invention formulated in a compatible pharmaceutical carrier may also be prepared, placed in an
appropriate container, and labeled for treatment of an indicated condition. 



4.13 ANTIBODIES 

Also included in the invention are antibodies to proteins, or fragments of proteins of the
invention. The term "antibody" as used herein refers to immunoglobulin molecules and
immunologically active portions of immunoglobulin (Ig) molecules, i.e., molecules that contain
an antigen binding site that specifically binds (immunoreacts with) an antigen. Such antibodies
include, but are not limited to, polyclonal, monoclonal, chimeric, single chain, Fab, Fab' and F(ab')2
fragments, and an Fab expression library. In general, an antibody molecule obtained from
humans relates to any of the classes IgG, IgM, IgA, IgE and IgD, which differ from one another
by the nature of the heavy chain present in the molecule. Certain classes have subclasses as well,
such as IgG1, IgG2, and others. Furthermore, in humans, the light chain may be a kappa chain or
a lambda chain. Reference herein to antibodies includes a reference to all such classes,
subclasses and types of human antibody species.

An isolated protein of the invention, or a portion or fragment thereof, may serve as an
antigen, and additionally can be used as an immunogen to generate
antibodies that immunospecifically bind the antigen, using standard techniques for polyclonal
and monoclonal antibody preparation. The full-length protein can be used or, alternatively, the
invention provides antigenic peptide fragments of the antigen for use as immunogens. An
antigenic peptide fragment comprises at least 6 amino acid residues of the amino acid sequence
of the full length protein, such as an amino acid sequence shown in SEQ ID NO: 1787, and
encompasses an epitope thereof such that an antibody raised against the peptide forms a specific
immune complex with the full length protein or with any fragment that contains the epitope.
Preferably, the antigenic peptide comprises at least 10 amino acid residues, or at least 15 amino
acid residues, or at least 20 amino acid residues, or at least 30 amino acid residues. Preferred
epitopes encompassed by the antigenic peptide are regions of the protein that are located on its
surface; commonly these are hydrophilic regions.

In certain embodiments of the invention, at least one epitope encompassed by the
antigenic peptide is a region of the protein that is located on the surface of the protein, e.g., a
hydrophilic region. A hydrophobicity analysis of the human protein sequence will
indicate which regions of the protein are particularly hydrophilic and, therefore, are likely
to encode surface residues useful for targeting antibody production. As a means for targeting
antibody production, hydropathy plots showing regions of hydrophilicity and hydrophobicity
may be generated by any method well known in the art, including, for example, the Kyte-
Doolittle or the Hopp-Woods methods, either with or without Fourier transformation. See, e.g.,
Hopp and Woods, 1981, Proc. Natl. Acad. Sci. USA 78: 3824-3828; Kyte and Doolittle 1982, J.
Mol. Biol. 157: 105-142, each of which is incorporated herein by reference in its entirety.
Antibodies that are specific for one or more domains within an antigenic protein, or derivatives,
fragments, analogs or homologs thereof, are also provided herein.
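
A hydropathy profile of the kind described above can be computed with a sliding window over the sequence. The following is a minimal sketch using the published Kyte-Doolittle hydropathy scale without Fourier transformation; the window length of 7 and the test sequence are assumptions chosen for illustration, and the sequence is hypothetical rather than one of the sequences of the invention.

```python
# Kyte-Doolittle hydropathy values; higher values are more hydrophobic.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=7):
    """Return the mean hydropathy of every window along the sequence."""
    sequence = sequence.upper()
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(sequence) - window + 1)
    ]


def most_hydrophilic_window(sequence, window=7):
    """Return (start index, peptide, mean score) of the lowest-scoring window."""
    profile = hydropathy_profile(sequence, window)
    start = min(range(len(profile)), key=profile.__getitem__)
    return start, sequence.upper()[start:start + window], profile[start]


# Hypothetical sequence: a lysine-rich stretch flanked by leucines.
start, peptide, score = most_hydrophilic_window("LLLLLLLKKKKKKKLLLLLLL", window=7)
print(start, peptide, round(score, 2))
```

Windows with the lowest mean score are the most hydrophilic and are therefore candidate surface regions; a window of at least 6 residues matches the minimum antigenic peptide length stated above.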

A protein of the invention, or a derivative, fragment, analog, homolog or ortholog 
thereof, may be utilized as an immunogen in the generation of antibodies that 
immunospecifically bind these protein components.

Various procedures known within the art may be used for the production of polyclonal or 
monoclonal antibodies directed against a protein of the invention, or against derivatives,
fragments, analogs, homologs or orthologs thereof (see, for example, Antibodies: A Laboratory
Manual, Harlow E, and Lane D, 1988, Cold Spring Harbor Laboratory Press, Cold Spring 
Harbor, NY, incorporated herein by reference). Some of these antibodies are discussed below. 

5.13.1 Polyclonal Antibodies 

For the production of polyclonal antibodies, various suitable host animals (e.g., rabbit, 
goat, mouse or other mammal) may be immunized by one or more injections with the native 
protein, a synthetic variant thereof, or a derivative of the foregoing. An appropriate 
immunogenic preparation can contain, for example, the naturally occurring immunogenic 
protein, a chemically synthesized polypeptide representing the immunogenic protein, or a 
recombinantly expressed immunogenic protein. Furthermore, the protein may be conjugated to
a second protein known to be immunogenic in the mammal being immunized. Examples of such 
immunogenic proteins include but are not limited to keyhole limpet hemocyanin, serum albumin, 
bovine thyroglobulin, and soybean trypsin inhibitor. The preparation can further include an 
adjuvant. Various adjuvants used to increase the immunological response include, but are not 
limited to, Freund's (complete and incomplete), mineral gels (e.g., aluminum hydroxide), surface 
active substances (e.g., lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, 
dinitrophenol, etc.), adjuvants usable in humans such as Bacille Calmette-Guerin and 
Corynebacterium parvum, or similar immunostimulatory agents. Additional examples of
adjuvants which can be employed include MPL-TDM adjuvant (monophosphoryl Lipid A, 
synthetic trehalose dicorynomycolate). 

The polyclonal antibody molecules directed against the immunogenic protein can be 
isolated from the mammal (e.g., from the blood) and further purified by well known techniques, 
such as affinity chromatography using protein A or protein G, which provide primarily the IgG 
fraction of immune serum. Subsequently, or alternatively, the specific antigen which is the
target of the immunoglobulin sought, or an epitope thereof, may be immobilized on a column to
purify the immune-specific antibody by immunoaffinity chromatography. Purification of
immunoglobulins is discussed, for example, by D. Wilkinson (The Scientist, published by The 
Scientist, Inc., Philadelphia PA, Vol. 14, No. 8 (April 17, 2000), pp. 25-28). 

5.13.2 Monoclonal Antibodies 

The term "monoclonal antibody" (MAb) or "monoclonal antibody composition", as used 
herein, refers to a population of antibody molecules that contain only one molecular species of 
antibody molecule consisting of a unique light chain gene product and a unique heavy chain 
gene product. In particular, the complementarity determining regions (CDRs) of the monoclonal 
antibody are identical in all the molecules of the population. MAbs thus contain an antigen 
binding site capable of immunoreacting with a particular epitope of the antigen characterized by 
a unique binding affinity for it.

Monoclonal antibodies can be prepared using hybridoma methods, such as those 
described by Kohler and Milstein, Nature, 256:495 (1975). In a hybridoma method, a mouse,
hamster, or other appropriate host animal, is typically immunized with an immunizing agent to 
elicit lymphocytes that produce or are capable of producing antibodies that will specifically bind 
to the immunizing agent. Alternatively, the lymphocytes can be immunized in vitro. 
The immunizing agent will typically include the protein antigen, a fragment thereof or a fusion 
protein thereof. Generally, either peripheral blood lymphocytes are used if cells of human origin 
are desired, or spleen cells or lymph node cells are used if non-human mammalian sources are 
desired. The lymphocytes are then fused with an immortalized cell line using a suitable fusing 
agent, such as polyethylene glycol, to form a hybridoma cell (Goding, Monoclonal Antibodies: 
Principles and Practice, Academic Press, (1986) pp. 59-103). Immortalized cell lines are usually
transformed mammalian cells, particularly myeloma cells of rodent, bovine and human origin. 
Usually, rat or mouse myeloma cell lines are employed. The hybridoma cells can be cultured in 
a suitable culture medium that preferably contains one or more substances that inhibit the growth 
or survival of the unfused, immortalized cells. For example, if the parental cells lack the enzyme 
hypoxanthine guanine phosphoribosyl transferase (HGPRT or HPRT), the culture medium for 
the hybridomas typically will include hypoxanthine, aminopterin, and thymidine ("HAT 
medium"), which substances prevent the growth of HGPRT-deficient cells. 

Preferred immortalized cell lines are those that fuse efficiently, support stable high level 
expression of antibody by the selected antibody-producing cells, and are sensitive to a medium 
such as HAT medium. More preferred immortalized cell lines are murine myeloma lines, which 
can be obtained, for instance, from the Salk Institute Cell Distribution Center, San Diego,
California and the American Type Culture Collection, Manassas, Virginia. Human myeloma and
mouse-human heteromyeloma cell lines also have been described for the production of human
monoclonal antibodies (Kozbor, J. Immunol., 133:3001 (1984); Brodeur et al., Monoclonal
Antibody Production Techniques and Applications, Marcel Dekker, Inc., New York, (1987) pp.
51-63).

The culture medium in which the hybridoma cells are cultured can then be assayed for 
the presence of monoclonal antibodies directed against the antigen. Preferably, the binding 
specificity of monoclonal antibodies produced by the hybridoma cells is determined by 
immunoprecipitation or by an in vitro binding assay, such as radioimmunoassay (RIA) or 
enzyme-linked immunoabsorbent assay (ELISA). Such techniques and assays are known in the
art. The binding affinity of the monoclonal antibody can, for example, be determined by the
Scatchard analysis of Munson and Pollard, Anal. Biochem., 107:220 (1980). Preferably,
antibodies having a high degree of specificity and a high binding affinity for the target antigen 
are isolated. 
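
For illustration, a binding affinity can be estimated from equilibrium binding data by the classical linear Scatchard transformation, in which bound/free is plotted against bound, the slope equals -Ka and the x-intercept equals the maximum binding Bmax. The sketch below uses an ordinary least-squares line on synthetic data; it is an assumption-laden example and does not reproduce the specific procedure of the cited reference. The data values and function name are hypothetical.

```python
def scatchard_fit(bound, free):
    """Fit bound/free = -Ka * bound + Ka * Bmax by least squares.

    Returns (Ka, Bmax); the dissociation constant is Kd = 1 / Ka.
    """
    if len(bound) != len(free) or len(bound) < 2:
        raise ValueError("need at least two paired measurements")
    x = list(bound)
    y = [b / f for b, f in zip(bound, free)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ka = -slope
    bmax = intercept / ka
    return ka, bmax


# Synthetic single-site data (hypothetical): Bmax = 10 nM, Kd = 2 nM.
free_nm = [0.5, 1.0, 2.0, 4.0, 8.0]
bound_nm = [10.0 * f / (2.0 + f) for f in free_nm]

ka, bmax = scatchard_fit(bound_nm, free_nm)
print(f"Ka = {ka:.3f} per nM, Kd = {1 / ka:.3f} nM, Bmax = {bmax:.3f} nM")
```

With real assay data the points will scatter about the line, and curvature in the plot would suggest that a single-site model is not adequate.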

After the desired hybridoma cells are identified, the clones can be subcloned by limiting
dilution procedures and grown by standard methods. Suitable culture media for this purpose
include, for example, Dulbecco's Modified Eagle's Medium and RPMI-1640 medium.
Alternatively, the hybridoma cells can be grown in vivo as ascites in a mammal.
The monoclonal antibodies secreted by the subclones can be isolated or purified from the culture
medium or ascites fluid by conventional immunoglobulin purification procedures such as, for
example, protein A-Sepharose, hydroxylapatite chromatography, gel electrophoresis, dialysis, or
affinity chromatography.

The monoclonal antibodies can also be made by recombinant DNA methods, such as
those described in U.S. Patent No. 4,816,567. DNA encoding the monoclonal antibodies of the
invention can be readily isolated and sequenced using conventional procedures (e.g., by using
oligonucleotide probes that are capable of binding specifically to genes encoding the heavy and
light chains of murine antibodies). The hybridoma cells of the invention serve as a preferred
source of such DNA. Once isolated, the DNA can be placed into expression vectors, which are
then transfected into host cells such as simian COS cells, Chinese hamster ovary (CHO) cells, or
myeloma cells that do not otherwise produce immunoglobulin protein, to obtain the synthesis of
monoclonal antibodies in the recombinant host cells. The DNA also can be modified, for
example, by substituting the coding sequence for human heavy and light chain constant domains
in place of the homologous murine sequences (U.S. Patent No. 4,816,567; Morrison, Nature 368,
812-13 (1994)) or by covalently joining to the immunoglobulin coding sequence all or part of the
coding sequence for a non-immunoglobulin polypeptide. Such a non-immunoglobulin
polypeptide can be substituted for the constant domains of an antibody of the invention, or can
be substituted for the variable domains of one antigen-combining site of an antibody of the
invention to create a chimeric bivalent antibody.

5.13.2 Humanized Antibodies

The antibodies directed against the protein antigens of the invention can further comprise 
humanized antibodies or human antibodies. These antibodies are suitable for administration to 
humans without engendering an immune response by the human against the administered 
immunoglobulin. Humanized forms of antibodies are chimeric immunoglobulins, 
immunoglobulin chains or fragments thereof (such as Fv, Fab, Fab', F(ab')2 or other antigen-binding
subsequences of antibodies) that are principally comprised of the sequence of a human
immunoglobulin, and contain minimal sequence derived from a non-human immunoglobulin. 
Humanization can be performed following the method of Winter and co-workers (Jones et al., 
Nature, 321:522-525 (1986); Riechmann et al., Nature, 332:323-327 (1988); Verhoeyen et al.,
Science, 239:1534-1536 (1988)), by substituting rodent CDRs or CDR sequences for the 
corresponding sequences of a human antibody. (See also U.S. Patent No. 5,225,539.) In some 
instances, Fv framework residues of the human immunoglobulin are replaced by corresponding 
non-human residues. Humanized antibodies can also comprise residues which are found neither 
in the recipient antibody nor in the imported CDR or framework sequences. In general, the 
humanized antibody will comprise substantially all of at least one, and typically two, variable 
domains, in which all or substantially all of the CDR regions correspond to those of a non-human 
immunoglobulin and all or substantially all of the framework regions are those of a human 
immunoglobulin consensus sequence. The humanized antibody optimally also will comprise at 
least a portion of an immunoglobulin constant region (Fc), typically that of a human 
immunoglobulin (Jones et al., 1986; Riechmann et al., 1988; and Presta, Curr. Op. Struct. Biol.,
2:593-596 (1992)).

5.13.3 Human Antibodies 

Fully human antibodies relate to antibody molecules in which essentially the entire 
sequences of both the light chain and the heavy chain, including the CDRs, arise from human 
genes. Such antibodies are termed "human antibodies", or "fully human antibodies" herein.
Human monoclonal antibodies can be prepared by the trioma technique; the human B-cell 
hybridoma technique (see Kozbor, et al., 1983 Immunol Today 4: 72) and the EBV hybridoma 
technique to produce human monoclonal antibodies (see Cole, et al., 1985 In: Monoclonal 
Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). Human monoclonal
antibodies may be utilized in the practice of the present invention and may be produced by using
human hybridomas (see Cote, et al., 1983. Proc Natl Acad Sci USA 80: 2026-2030) or by
transforming human B-cells with Epstein Barr Virus in vitro (see Cole, et al., 1985 In:
Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). 

In addition, human antibodies can also be produced using additional techniques, 
including phage display libraries (Hoogenboom and Winter, J. Mol. Biol., 227:381 (1991);
Marks et al., J. Mol. Biol., 222:581 (1991)). Similarly, human antibodies can be made by
introducing human immunoglobulin loci into transgenic animals, e.g., mice in which the 
endogenous immunoglobulin genes have been partially or completely inactivated. Upon 
challenge, human antibody production is observed, which closely resembles that seen in humans 
in all respects, including gene rearrangement, assembly, and antibody repertoire. This approach 
is described, for example, in U.S. Patent Nos. 5,545,807; 5,545,806; 5,569,825; 5,625,126; 
5,633,425; 5,661,016, and in Marks et al. (Bio/Technology 10, 779-783 (1992)); Lonberg et al.
(Nature 368, 856-859 (1994)); Morrison (Nature 368, 812-13 (1994)); Fishwild et al. (Nature
Biotechnology 14, 845-51 (1996)); Neuberger (Nature Biotechnology 14, 826 (1996)); and
Lonberg and Huszar (Intern. Rev. Immunol. 13, 65-93 (1995)).

Human antibodies may additionally be produced using transgenic nonhuman animals 
which are modified so as to produce fully human antibodies rather than the animal's endogenous 
antibodies in response to challenge by an antigen. (See PCT publication WO94/02602). The 
endogenous genes encoding the heavy and light immunoglobulin chains in the nonhuman host 
have been incapacitated, and active loci encoding human heavy and light chain immunoglobulins 
are inserted into the host's genome. The human genes are incorporated, for example, using yeast 
artificial chromosomes containing the requisite human DNA segments. An animal which 
provides all the desired modifications is then obtained as progeny by crossbreeding intermediate 
transgenic animals containing fewer than the full complement of the modifications. The 
preferred embodiment of such a nonhuman animal is a mouse, and is termed the Xenomouse™ 
as disclosed in PCT publications WO 96/33735 and WO 96/34096. This animal produces B cells 
which secrete fully human immunoglobulins. The antibodies can be obtained directly from the 
animal after immunization with an immunogen of interest, as, for example, a preparation of a 
polyclonal antibody, or alternatively from immortalized B cells derived from the animal, such as 
hybridomas producing monoclonal antibodies. Additionally, the genes encoding the 
immunoglobulins with human variable regions can be recovered and expressed to obtain the 
antibodies directly, or can be further modified to obtain analogs of antibodies such as, for 
example, single chain Fv molecules.

An example of a method of producing a nonhuman host, exemplified as a mouse, lacking
expression of an endogenous immunoglobulin heavy chain is disclosed in U.S. Patent No.
5,939,598. It can be obtained by a method including deleting the J segment genes from at least
one endogenous heavy chain locus in an embryonic stem cell to prevent rearrangement of the
locus and to prevent formation of a transcript of a rearranged immunoglobulin heavy chain locus,
the deletion being effected by a targeting vector containing a gene encoding a selectable marker; 
and producing from the embryonic stem cell a transgenic mouse whose somatic and germ cells 
contain the gene encoding the selectable marker. 

A method for producing an antibody of interest, such as a human antibody, is disclosed in
U.S. Patent No. 5,916,771. It includes introducing an expression vector that contains a
nucleotide sequence encoding a heavy chain into one mammalian host cell in culture, introducing
an expression vector containing a nucleotide sequence encoding a light chain into another
mammalian host cell, and fusing the two cells to form a hybrid cell. The hybrid cell expresses an
antibody containing the heavy chain and the light chain.

In a further improvement on this procedure, a method for identifying a clinically relevant
epitope on an immunogen, and a correlative method for selecting an antibody that binds
immunospecifically to the relevant epitope with high affinity, are disclosed in PCT publication
WO 99/53049.

5.13.4 Fab Fragments and Single Chain Antibodies

According to the invention, techniques can be adapted for the production of single-chain
antibodies specific to an antigenic protein of the invention (see e.g., U.S. Patent No. 4,946,778).
In addition, methods can be adapted for the construction of Fab expression libraries (see e.g.,
Huse, et al., 1989 Science 246: 1275-1281) to allow rapid and effective identification of
monoclonal Fab fragments with the desired specificity for a protein or derivatives, fragments,
analogs or homologs thereof. Antibody fragments that contain the idiotypes to a protein antigen
may be produced by techniques known in the art including, but not limited to: (i) an F(ab')2
fragment produced by pepsin digestion of an antibody molecule; (ii) an Fab fragment generated
by reducing the disulfide bridges of an F(ab')2 fragment; (iii) an Fab fragment generated by the
treatment of the antibody molecule with papain and a reducing agent; and (iv) Fv fragments.

5.13.5 Bispecific Antibodies 

Bispecific antibodies are monoclonal, preferably human or humanized, antibodies that 
have binding specificities for at least two different antigens. In the present case, one of the
binding specificities is for an antigenic protein of the invention. The second binding target is any
other antigen, and advantageously is a cell-surface protein or receptor or receptor subunit. 

Methods for making bispecific antibodies are known in the art. Traditionally, the
recombinant production of bispecific antibodies is based on the co-expression of two
immunoglobulin heavy-chain/light-chain pairs, where the two heavy chains have different
specificities (Milstein and Cuello, Nature, 305:537-539 (1983)). Because of the random
assortment of immunoglobulin heavy and light chains, these hybridomas (quadromas) produce a
potential mixture of ten different antibody molecules, of which only one has the correct
bispecific structure. The purification of the correct molecule is usually accomplished by affinity
chromatography steps. Similar procedures are disclosed in WO 93/08829, published 13 May
1993, and in Traunecker et al., 1991 EMBO J., 10:3655-3659.
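
The figure of ten possible molecules follows from a simple counting argument, which may be sketched as below. The sketch assumes that either light chain can pair with either heavy chain to form a half-molecule, and that any two half-molecules can associate into a complete antibody; the chain labels H1, H2, L1 and L2 are hypothetical names introduced for illustration.

```python
from itertools import combinations_with_replacement, product

heavy_chains = ["H1", "H2"]
light_chains = ["L1", "L2"]

# Each half-molecule is one heavy chain paired with one light chain (4 kinds).
half_molecules = list(product(heavy_chains, light_chains))

# A complete antibody is an unordered pair of half-molecules (repeats allowed).
antibodies = list(combinations_with_replacement(half_molecules, 2))


def is_correct_bispecific(antibody):
    """True only when each heavy chain carries its cognate light chain."""
    return set(antibody) == {("H1", "L1"), ("H2", "L2")}


correct = [ab for ab in antibodies if is_correct_bispecific(ab)]

print(len(half_molecules), "half-molecules")
print(len(antibodies), "possible antibody molecules")
print(len(correct), "with the correct bispecific structure")
```

Under these assumptions the enumeration yields four half-molecules and ten distinct assemblies, of which exactly one pairs both heavy chains with their cognate light chains.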

Antibody variable domains with the desired binding specificities (antibody-antigen
combining sites) can be fused to immunoglobulin constant domain sequences. The fusion
preferably is with an immunoglobulin heavy-chain constant domain, comprising at least part of
the hinge, CH2, and CH3 regions. It is preferred to have the first heavy-chain constant region
(CH1) containing the site necessary for light-chain binding present in at least one of the fusions.
DNAs encoding the immunoglobulin heavy-chain fusions and, if desired, the immunoglobulin
light chain, are inserted into separate expression vectors, and are co-transfected into a suitable
host organism. For further details of generating bispecific antibodies see, for example, Suresh et
al., Methods in Enzymology, 121:210 (1986).

According to another approach described in WO 96/27011, the interface between a pair
of antibody molecules can be engineered to maximize the percentage of heterodimers which are
recovered from recombinant cell culture. The preferred interface comprises at least a part of the
CH3 region of an antibody constant domain. In this method, one or more small amino acid side
chains from the interface of the first antibody molecule are replaced with larger side chains (e.g.
tyrosine or tryptophan). Compensatory "cavities" of identical or similar size to the large side
chain(s) are created on the interface of the second antibody molecule by replacing large amino
acid side chains with smaller ones (e.g. alanine or threonine). This provides a mechanism for
increasing the yield of the heterodimer over other unwanted end-products such as homodimers.

Bispecific antibodies can be prepared as full length antibodies or antibody fragments (e.g.
F(ab')2 bispecific antibodies). Techniques for generating bispecific antibodies from antibody
fragments have been described in the literature. For example, bispecific antibodies can be
prepared using chemical linkage. Brennan et al., Science 229:81 (1985) describe a procedure
wherein intact antibodies are proteolytically cleaved to generate F(ab')2 fragments. These
fragments are reduced in the presence of the dithiol complexing agent sodium arsenite to
stabilize vicinal dithiols and prevent intermolecular disulfide formation. The Fab' fragments
generated are then converted to thionitrobenzoate (TNB) derivatives. One of the Fab'-TNB
derivatives is then reconverted to the Fab'-thiol by reduction with mercaptoethylamine and is
mixed with an equimolar amount of the other Fab'-TNB derivative to form the bispecific
antibody. The bispecific antibodies produced can be used as agents for the selective
immobilization of enzymes.

Additionally, Fab' fragments can be directly recovered from E. coli and chemically
coupled to form bispecific antibodies. Shalaby et al., J. Exp. Med. 175:217-225 (1992) describe
the production of a fully humanized bispecific antibody F(ab')2 molecule. Each Fab' fragment
was separately secreted from E. coli and subjected to directed chemical coupling in vitro to form
the bispecific antibody. The bispecific antibody thus formed was able to bind to cells
overexpressing the ErbB2 receptor and normal human T cells, as well as trigger the lytic activity
of human cytotoxic lymphocytes against human breast tumor targets.

Various techniques for making and isolating bispecific antibody fragments directly from
recombinant cell culture have also been described. For example, bispecific antibodies have been
produced using leucine zippers. Kostelny et al., J. Immunol. 148(5): 1547-1553 (1992). The
leucine zipper peptides from the Fos and Jun proteins were linked to the Fab' portions of two
different antibodies by gene fusion. The antibody homodimers were reduced at the hinge region
to form monomers and then re-oxidized to form the antibody heterodimers. This method can
also be utilized for the production of antibody homodimers. The "diabody" technology
described by Hollinger et al., Proc. Natl. Acad. Sci. USA 90:6444-6448 (1993) has provided an
alternative mechanism for making bispecific antibody fragments. The fragments comprise a
heavy-chain variable domain (VH) connected to a light-chain variable domain (VL) by a linker
which is too short to allow pairing between the two domains on the same chain. Accordingly,
the VH and VL domains of one fragment are forced to pair with the complementary VL and VH
domains of another fragment, thereby forming two antigen-binding sites. Another strategy for
making bispecific antibody fragments by the use of single-chain Fv (sFv) dimers has also been
reported. See, Gruber et al., J. Immunol. 152:5368 (1994).

Antibodies with more than two valencies are contemplated. For example, trispecific
antibodies can be prepared. Tutt et al., J. Immunol. 147:60 (1991).

Exemplary bispecific antibodies can bind to two different epitopes, at least one of which
originates in the protein antigen of the invention. Alternatively, an anti-antigenic arm of an
immunoglobulin molecule can be combined with an arm which binds to a triggering molecule on
a leukocyte such as a T-cell receptor molecule (e.g. CD2, CD3, CD28, or B7), or Fc receptors for
IgG (FcγR), such as FcγRI (CD64), FcγRII (CD32) and FcγRIII (CD16) so as to focus cellular
defense mechanisms to the cell expressing the particular antigen. Bispecific antibodies can also
be used to direct cytotoxic agents to cells which express a particular antigen. These antibodies
possess an antigen-binding arm and an arm which binds a cytotoxic agent or a radionuclide
chelator, such as EOTUBE, DPTA, DOTA, or TETA. Another bispecific antibody of interest
binds the protein antigen described herein and further binds tissue factor (TF).

5.13.6 Heteroconjugate Antibodies 

Heteroconjugate antibodies are also within the scope of the present invention.
Heteroconjugate antibodies are composed of two covalently joined antibodies. Such antibodies
have, for example, been proposed to target immune system cells to unwanted cells (U.S. Patent
No. 4,676,980), and for treatment of HIV infection (WO 91/00360; WO 92/200373; EP 03089).
It is contemplated that the antibodies can be prepared in vitro using known methods in synthetic
protein chemistry, including those involving crosslinking agents. For example, immunotoxins
can be constructed using a disulfide exchange reaction or by forming a thioether bond.
Examples of suitable reagents for this purpose include iminothiolate and methyl-4-
mercaptobutyrimidate and those disclosed, for example, in U.S. Patent No. 4,676,980.

5.13.7 Effector Function Engineering 

It can be desirable to modify the antibody of the invention with respect to effector function, so as
to enhance, e.g., the effectiveness of the antibody in treating cancer. For example, cysteine
residue(s) can be introduced into the Fc region, thereby allowing interchain disulfide bond
formation in this region. The homodimeric antibody thus generated can have improved
internalization capability and/or increased complement-mediated cell killing and antibody-
dependent cellular cytotoxicity (ADCC). See Caron et al., J. Exp. Med., 176: 1191-1195 (1992)
and Shopes, J. Immunol., 148: 2918-2922 (1992). Homodimeric antibodies with enhanced anti-
tumor activity can also be prepared using heterobifunctional cross-linkers as described in Wolff
et al. Cancer Research, 53: 2560-2565 (1993). Alternatively, an antibody can be engineered that
has dual Fc regions and can thereby have enhanced complement lysis and ADCC capabilities.
See Stevenson et al., Anti-Cancer Drug Design, 3: 219-230 (1989).


5.13.8 Immunoconjugates 

The invention also pertains to immunoconjugates comprising an antibody conjugated to a
cytotoxic agent such as a chemotherapeutic agent, toxin (e.g., an enzymatically active toxin of
bacterial, fungal, plant, or animal origin, or fragments thereof), or a radioactive isotope (i.e., a
radioconjugate).

Chemotherapeutic agents useful in the generation of such immunoconjugates have been
described above. Enzymatically active toxins and fragments thereof that can be used include
diphtheria A chain, nonbinding active fragments of diphtheria toxin, exotoxin A chain (from
Pseudomonas aeruginosa), ricin A chain, abrin A chain, modeccin A chain, alpha-sarcin,
Aleurites fordii proteins, dianthin proteins, Phytolacca americana proteins (PAPI, PAPII, and
PAP-S), momordica charantia inhibitor, curcin, crotin, saponaria officinalis inhibitor, gelonin,
mitogellin, restrictocin, phenomycin, enomycin, and the trichothecenes. A variety of
radionuclides are available for the production of radioconjugated antibodies. Examples include
212Bi, 131I, 131In, 90Y, and 186Re.

Conjugates of the antibody and cytotoxic agent are made using a variety of bifunctional
protein-coupling agents such as N-succinimidyl-3-(2-pyridyldithiol) propionate (SPDP),
iminothiolane (IT), bifunctional derivatives of imidoesters (such as dimethyl adipimidate HCL),
active esters (such as disuccinimidyl suberate), aldehydes (such as glutaraldehyde), bis-azido
compounds (such as bis (p-azidobenzoyl) hexanediamine), bis-diazonium derivatives (such as
bis-(p-diazoniumbenzoyl)-ethylenediamine), diisocyanates (such as tolyene 2,6-diisocyanate),
and bis-active fluorine compounds (such as 1,5-difluoro-2,4-dinitrobenzene). For example, a
ricin immunotoxin can be prepared as described in Vitetta et al., Science, 238: 1098 (1987).
Carbon-14-labeled 1-isothiocyanatobenzyl-3-methyldiethylene triaminepentaacetic acid (MX-
DTPA) is an exemplary chelating agent for conjugation of radionucleotide to the antibody. See
WO94/11026.

In another embodiment, the antibody can be conjugated to a "receptor" (such as
streptavidin) for utilization in tumor pretargeting wherein the antibody-receptor conjugate is 
administered to the patient, followed by removal of unbound conjugate from the circulation 
using a clearing agent and then administration of a "ligand" (e.g., avidin) that is in turn 
conjugated to a cytotoxic agent. 

4.14 COMPUTER READABLE SEQUENCES 

In one application of this embodiment, a nucleotide sequence of the present invention can 
be recorded on computer readable media. As used herein, "computer readable media" refers to 
any medium which can be read and accessed directly by a computer. Such media include, but 
are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and 
magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM 
and ROM; and hybrids of these categories such as magnetic/optical storage media. A skilled 
artisan can readily appreciate how any of the presently known computer readable mediums can 
be used to create a manufacture comprising computer readable medium having recorded thereon
a nucleotide sequence of the present invention. As used herein, "recorded" refers to a process for
storing information on computer readable medium. A skilled artisan can readily adopt any of the 
presently known methods for recording information on computer readable medium to generate 
manufactures comprising the nucleotide sequence information of the present invention.

A variety of data storage structures are available to a skilled artisan for creating a
computer readable medium having recorded thereon a nucleotide sequence of the present
invention. The choice of the data storage structure will generally be based on the means chosen
to access the stored information. In addition, a variety of data processor programs and formats
can be used to store the nucleotide sequence information of the present invention on computer
readable medium. The sequence information can be represented in a word processing text file,
formatted in commercially-available software such as WordPerfect and Microsoft Word, or
represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase,
Oracle, or the like. A skilled artisan can readily adapt any number of data processor structuring
formats (e.g. text file or database) in order to obtain computer readable medium having recorded
thereon the nucleotide sequence information of the present invention.
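
By way of illustration only, the two storage options named above (an ASCII text file and a database application) can be sketched as follows. The sketch assumes a FASTA-style text layout and uses Python's built-in sqlite3 module as a stand-in for a database application such as DB2, Sybase or Oracle; the function names, the table layout and the placeholder sequence are hypothetical and are not taken from the present disclosure.

```python
import sqlite3


def to_fasta(seq_id, sequence, width=60):
    """Render one sequence as a FASTA-style ASCII record (assumed layout)."""
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return ">" + seq_id + "\n" + "\n".join(lines) + "\n"


def store_in_database(conn, seq_id, sequence):
    """Record a sequence in a relational table (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sequences "
        "(seq_id TEXT PRIMARY KEY, residues TEXT NOT NULL)"
    )
    conn.execute(
        "INSERT OR REPLACE INTO sequences VALUES (?, ?)", (seq_id, sequence)
    )
    conn.commit()


def fetch_from_database(conn, seq_id):
    """Return the stored residues for seq_id, or None if absent."""
    row = conn.execute(
        "SELECT residues FROM sequences WHERE seq_id = ?", (seq_id,)
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    # Placeholder residues only; not a sequence of the invention.
    example = "ATGGCCAAGTTGACCTAA"
    print(to_fasta("example_1", example, width=10))
    connection = sqlite3.connect(":memory:")
    store_in_database(connection, "example_1", example)
    print(fetch_from_database(connection, "example_1"))
```

Either representation keeps the residues retrievable by identifier; the choice between them would, as stated above, generally follow from the means chosen to access the stored information.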

By providing any of the nucleotide sequences SEQ ID NO: 1-1786 and 3573-5358 or a 
representative fragment thereof; or a nucleotide sequence at least 95% identical to any of the 
nucleotide sequences of SEQ ID NO: 1-1786 and 3573-5358 in computer readable form, a skilled 
artisan can routinely access the sequence information for a variety of purposes. Computer
software is publicly available which allows a skilled artisan to access sequence information
provided in a computer readable medium. The examples which follow demonstrate how
software which implements the BLAST (Altschul et al., J. Mol. Biol. 215:403-410 (1990)) and
BLAZE (Brutlag et al., Comp. Chem. 17:203-207 (1993)) search algorithms on a Sybase system
is used to identify open reading frames (ORFs) within a nucleic acid sequence. Such ORFs may
be protein encoding fragments and may be useful in producing commercially important proteins
such as enzymes used in fermentation reactions and in the production of commercially useful 
metabolites. 
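
As a hedged illustration of what identifying ORFs within a nucleic acid sequence can involve, the following minimal sketch scans the three reading frames of each strand for an ATG start codon followed in frame by a stop codon. It is not the BLAST or BLAZE software referred to above; the function names and the min_codons threshold are assumptions introduced for the example, and only the standard stop codons TAA, TAG and TGA are considered.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=3):
    """Return (strand, frame, start, end, orf) tuples.

    start and end index into the scanned strand; the ORF runs from the
    first ATG in a frame to the next in-frame stop codon (inclusive).
    min_codons counts the stop codon (hypothetical threshold).
    """
    seq = seq.upper()
    results = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    if (i + 3 - start) // 3 >= min_codons:
                        results.append((strand, frame, start, i + 3, s[start:i + 3]))
                    start = None
    return results


if __name__ == "__main__":
    for orf in find_orfs("ATGAAATAG"):
        print(orf)
```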

As used herein, "a computer-based system" refers to the hardware means, software 
means, and data storage means used to analyze the nucleotide sequence information of the
present invention. The minimum hardware means of the computer-based systems of the present
invention comprises a central processing unit (CPU), input means, output means, and data
storage means. A skilled artisan can readily appreciate that any one of the currently available
computer-based systems is suitable for use in the present invention. As stated above, the
computer-based systems of the present invention comprise a data storage means having stored
therein a nucleotide sequence of the present invention and the necessary hardware means and
software means for supporting and implementing a search means. As used herein, "data storage
means" refers to memory which can store nucleotide sequence information of the present 
invention, or a memory access means which can access manufactures having recorded thereon 
the nucleotide sequence information of the present invention.

As used herein, "search means" refers to one or more programs which are implemented
on the computer-based system to compare a target sequence or target structural motif with the
sequence information stored within the data storage means. Search means are used to identify
fragments or regions of a known sequence which match a particular target sequence or target
motif. A variety of known algorithms are disclosed publicly and a variety of commercially
available software for conducting search means exists and can be used in the computer-based
systems of the present invention. Examples of such software include, but are not limited to,
Smith-Waterman, MacPattern (EMBL), BLASTN and BLASTX (NCBI). A
skilled artisan can readily recognize that any one of the available algorithms or implementing
software packages for conducting homology searches can be adapted for use in the present
computer-based systems. As used herein, a "target sequence" can be any nucleic acid or amino
acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can
readily recognize that the longer a target sequence is, the less likely a target sequence will be
present as a random occurrence in the database. The most preferred sequence length of a target
sequence is from about 10 to 300 amino acids, more preferably from about 30 to 100 nucleotide
residues. However, it is well recognized that searches for commercially important fragments,
such as sequence fragments involved in gene expression and protein processing, may be of 
shorter length. 

As used herein, "a target structural motif," or "target motif," refers to any rationally 
selected sequence or combination of sequences in which the sequence(s) are chosen based on a 
25 three-dimensional configuration which is formed upon the folding of the target motif. There are 
a variety of target motifs known in the art. Protein target motifs include, but are not limited to, 
enzyme active sites and signal sequences. Nucleic acid target motifs include, but are not limited 
to, promoter sequences, hairpin structures and inducible expression elements (protein binding 
sequences). 
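
As one hedged example of searching for a nucleic acid target motif, the sketch below looks for candidate hairpin structures, taken here to mean a short stem followed, after a loop of bounded length, by the reverse complement of that stem. The stem length, loop bounds and function names are assumptions chosen for the example; a practical search means would also score mismatches and thermodynamic stability, which this sketch does not attempt.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_hairpins(seq, stem=4, min_loop=3, max_loop=8):
    """Return (start, stem_sequence, loop_length) for each exact
    stem / loop / reverse-complement-stem arrangement found."""
    seq = seq.upper()
    hits = []
    for start in range(len(seq) - 2 * stem - min_loop + 1):
        left = seq[start:start + stem]
        partner = reverse_complement(left)
        for loop in range(min_loop, max_loop + 1):
            right_start = start + stem + loop
            right = seq[right_start:right_start + stem]
            if len(right) < stem:
                break  # ran off the end of the sequence
            if right == partner:
                hits.append((start, left, loop))
    return hits


if __name__ == "__main__":
    print(find_hairpins("GAACTTTTGTTC"))
```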

4.15 TRIPLE HELIX FORMATION 

In addition, the fragments of the present invention, as broadly described, can be used to 
control gene expression through triple helix formation or antisense DNA or RNA, both of which 
methods are based on the binding of a polynucleotide sequence to DNA or RNA. 
Polynucleotides suitable for use in these methods are preferably 20 to 40 bases in length and are
designed to be complementary to a region of the gene involved in transcription (triple helix - see
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan
et al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560
(1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca 
Raton, FL (1988)). Triple helix-formation optimally results in a shut-off of RNA transcription 
from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into 
polypeptide. Both techniques have been demonstrated to be effective in model systems. 
Information contained in the sequences of the present invention is necessary for the design of an 
antisense or triple helix oligonucleotide. 
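
To illustrate how sequence information feeds into antisense design, the following minimal sketch returns the DNA oligonucleotide complementary to a chosen window of an mRNA, with the window length restricted to the 20 to 40 base range stated above. It addresses the antisense case only; triple helix design involves additional constraints that are not modelled here. The function name, the window-selection interface and the placeholder mRNA are hypothetical.

```python
# Complement table accepting either T or U in the input; output is DNA.
COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def antisense_oligo(mrna, start, length=20):
    """Return the DNA oligonucleotide complementary (antiparallel) to
    mrna[start:start + length]; length must be 20 to 40 bases."""
    if not 20 <= length <= 40:
        raise ValueError("length must be between 20 and 40 bases")
    target = mrna.upper()[start:start + length]
    if len(target) < length:
        raise ValueError("window extends past the end of the sequence")
    return target.translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    # Placeholder residues only; not a sequence of the invention.
    print(antisense_oligo("AUGGCCAAGUUGACCGAUCC", 0, 20))
```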

4.16 DIAGNOSTIC ASSAYS AND KITS 

The present invention further provides methods to identify the presence or expression of 
one of the ORFs of the present invention, or homolog thereof, in a test sample, using a nucleic 
acid probe or antibodies of the present invention, optionally conjugated or otherwise associated 
with a suitable label. 

In general, methods for detecting a polynucleotide of the invention can comprise 
contacting a sample with a compound that binds to and forms a complex with the polynucleotide 
for a period sufficient to form the complex, and detecting the complex, so that if a complex is 
detected, a polynucleotide of the invention is detected in the sample. Such methods can also 
comprise contacting a sample under stringent hybridization conditions with nucleic acid primers 
that anneal to a polynucleotide of the invention under such conditions, and amplifying annealed 
polynucleotides, so that if a polynucleotide is amplified, a polynucleotide of the invention is 
detected in the sample. 

In general, methods for detecting a polypeptide of the invention can comprise contacting 
a sample with a compound that binds to and forms a complex with the polypeptide for a period 
sufficient to form the complex, and detecting the complex, so that if a complex is detected, a 
polypeptide of the invention is detected in the sample. 

In detail, such methods comprise incubating a test sample with one or more of the 
antibodies or one or more of the nucleic acid probes of the present invention and assaying for 
binding of the nucleic acid probes or antibodies to components within the test sample. 

Conditions for incubating a nucleic acid probe or antibody with a test sample vary. 
Incubation conditions depend on the format employed in the assay, the detection methods 
employed, and the type and nature of the nucleic acid probe or antibody used in the assay. One 
skilled in the art will recognize that any one of the commonly available hybridization, 
amplification or immunological assay formats can readily be adapted to employ the nucleic acid
probes or antibodies of the present invention. Examples of such assays can be found in Chard,
T., An Introduction to Radioimmunoassay and Related Techniques, Elsevier Science Publishers,
Amsterdam, The Netherlands (1986); Bullock, G.R. et al., Techniques in Immunocytochemistry,
Academic Press, Orlando, FL Vol. 1 (1982), Vol. 2 (1983), Vol. 3 (1985); Tijssen, P., Practice
and Theory of Immunoassays: Laboratory Techniques in Biochemistry and Molecular Biology,
Elsevier Science Publishers, Amsterdam, The Netherlands (1985). The test samples of the 
present invention include cells, protein or membrane extracts of cells, or biological fluids such as 
sputum, blood, serum, plasma, or urine. The test sample used in the above-described method 
will vary based on the assay format, nature of the detection method and the tissues, cells or 
extracts used as the sample to be assayed. Methods for preparing protein extracts or membrane 
extracts of cells are well known in the art and can readily be adapted in order to obtain a
sample which is compatible with the system utilized. 

In another embodiment of the present invention, kits are provided which contain the 
necessary reagents to carry out the assays of the present invention. Specifically, the invention 
provides a compartment kit to receive, in close confinement, one or more containers which 
comprises: (a) a first container comprising one of the probes or antibodies of the present 
invention; and (b) one or more other containers comprising one or more of the following: wash 
reagents and reagents capable of detecting the presence of a bound probe or antibody.

In detail, a compartment kit includes any kit in which reagents are contained in separate 
containers. Such containers include small glass containers, plastic containers or strips of plastic 
or paper. Such containers allow one to efficiently transfer reagents from one compartment to
another compartment such that the samples and reagents are not cross-contaminated, and the 
agents or solutions of each container can be added in a quantitative fashion from one 
compartment to another. Such containers will include a container which will accept the test 
sample, a container which contains the antibodies used in the assay, containers which contain 
wash reagents (such as phosphate buffered saline, Tris-buffers, etc.), and containers which 
contain the reagents used to detect the bound antibody or probe. Types of detection reagents 
include labeled nucleic acid probes, labeled secondary antibodies, or in the alternative, if the 
primary antibody is labeled, the enzymatic, or antibody binding reagents which are capable of 
reacting with the labeled antibody. One skilled in the art will readily recognize that the disclosed 
probes and antibodies of the present invention can be readily incorporated into one of the 
established kit formats which are well known in the art. 

4.17 MEDICAL IMAGING

The novel polypeptides and binding partners of the invention are useful in medical
imaging of sites expressing the molecules of the invention (e.g., where the polypeptide of the
invention is involved in the immune response, for imaging sites of inflammation or infection).
See, e.g., Kunkel et al., U.S. Pat. No. 5,413,778. Such methods involve chemical attachment of
a labeling or imaging agent, administration of the labeled polypeptide to a subject in a 
pharmaceutically acceptable carrier, and imaging the labeled polypeptide in vivo at the target 
site. 

4.18 SCREENING ASSAYS 

Using the isolated proteins and polynucleotides of the invention, the present invention 
further provides methods of obtaining and identifying agents which bind to a polypeptide 
encoded by an ORF corresponding to any of the nucleotide sequences set forth in SEQ ID NO:1-1786
and 3573-5358, or bind to a specific domain of the polypeptide encoded by the nucleic
acid. In detail, said method comprises the steps of: 

(a) contacting an agent with an isolated protein encoded by an ORF of the present 
invention, or nucleic acid of the invention; and 

(b) determining whether the agent binds to said protein or said nucleic acid.

In general, therefore, such methods for identifying compounds that bind to a
polynucleotide of the invention can comprise contacting a compound with a polynucleotide of
the invention for a time sufficient to form a polynucleotide/compound complex, and detecting 
the complex, so that if a polynucleotide/compound complex is detected, a compound that binds 
to a polynucleotide of the invention is identified. 

Likewise, in general, therefore, such methods for identifying compounds that bind to a 
polypeptide of the invention can comprise contacting a compound with a polypeptide of the 
invention for a time sufficient to form a polypeptide/compound complex, and detecting the 
complex, so that if a polypeptide/compound complex is detected, a compound that binds to a 
polypeptide of the invention is identified.

Methods for identifying compounds that bind to a polypeptide of the invention can also 
comprise contacting a compound with a polypeptide of the invention in a cell for a time 
sufficient to form a polypeptide/compound complex, wherein the complex drives expression of a 
reporter gene sequence in the cell, and detecting the complex by detecting reporter gene
sequence expression, so that if a polypeptide/compound complex is detected, a compound that 
binds a polypeptide of the invention is identified. 

Compounds identified via such methods can include compounds which modulate the 
activity of a polypeptide of the invention (that is, increase or decrease its activity, relative to
activity observed in the absence of the compound). Alternatively, compounds identified via such
methods can include compounds which modulate the expression of a polynucleotide of the 
invention (that is, increase or decrease expression relative to expression levels observed in the 
absence of the compound). Compounds, such as compounds identified via the methods of the 
invention, can be tested using standard assays well known to those of skill in the art for their 
ability to modulate activity/expression. 

The agents screened in the above assay can be, but are not limited to, peptides, 
carbohydrates, vitamin derivatives, or other pharmaceutical agents. The agents can be selected 
and screened at random or rationally selected or designed using protein modeling techniques. 

For random screening, agents such as peptides, carbohydrates, pharmaceutical agents and 
the like are selected at random and are assayed for their ability to bind to the protein encoded by 
the ORF of the present invention. Alternatively, agents may be rationally selected or designed. 
As used herein, an agent is said to be "rationally selected or designed" when the agent is chosen 
based on the configuration of the particular protein. For example, one skilled in the art can 
readily adapt currently available procedures to generate peptides, pharmaceutical agents and the 
like, capable of binding to a specific peptide sequence, in order to generate rationally designed 
antipeptide peptides, for example see Hurby et al., "Application of Synthetic Peptides: Antisense
Peptides," In Synthetic Peptides, A User's Guide, W.H. Freeman, NY (1992), pp. 289-307, and 
Kaspczak et al., Biochemistry 28:9230-8 (1989), or pharmaceutical agents, or the like. 

In addition to the foregoing, one class of agents of the present invention, as broadly 
described, can be used to control gene expression through binding to one of the ORFs or EMFs 
of the present invention. As described above, such agents can be randomly screened or 
rationally designed/selected. Targeting the ORF or EMF allows a skilled artisan to design 
sequence specific or element specific agents, modulating the expression of either a single ORF or 
multiple ORFs which rely on the same EMF for expression control. One class of DNA binding 
agents are agents which contain base residues which hybridize or form a triple helix
by binding to DNA or RNA. Such agents can be based on the classic phosphodiester, 
ribonucleic acid backbone, or can be a variety of sulfhydryl or polymeric derivatives which have 
base attachment capacity. 

Agents suitable for use in these methods preferably contain 20 to 40 bases and are 
designed to be complementary to a region of the gene involved in transcription (triple helix - see 
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan et 
al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560
(1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca 
Raton, FL (1988)). Triple helix-formation optimally results in a shut-off of RNA transcription
from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into
polypeptide. Both techniques have been demonstrated to be effective in model systems. 
Information contained in the sequences of the present invention is necessary for the design of an 
antisense or triple helix oligonucleotide and other DNA binding agents. 

Agents which bind to a protein encoded by one of the ORFs of the present invention can 
be used as a diagnostic agent. Agents which bind to a protein encoded by one of the ORFs of the 
present invention can be formulated using known techniques to generate a pharmaceutical 
composition. 



4.19 USE OF NUCLEIC ACIDS AS PROBES 

Another aspect of the subject invention is to provide for polypeptide-specific nucleic acid 
hybridization probes capable of hybridizing with naturally occurring nucleotide sequences. The 
hybridization probes of the subject invention may be derived from any of the nucleotide 
sequences SEQ ID NO:1-1786 and 3573-5358. Because the corresponding gene is only
expressed in a limited number of tissues, a hybridization probe derived from any of the
nucleotide sequences SEQ ID NO: 1-1786 and 3573-5358 can be used as an indicator of the
presence of RNA of a cell type of such a tissue in a sample.

Any suitable hybridization technique can be employed, such as, for example, in situ 
hybridization. PCR as described in US Patents Nos. 4,683,195 and 4,965,188 provides 
additional uses for oligonucleotides based upon the nucleotide sequences. Such probes used in 
PCR may be of recombinant origin, may be chemically synthesized, or a mixture of both. The 
probe will comprise a discrete nucleotide sequence for the detection of identical sequences or a 
degenerate pool of possible sequences for identification of closely related genomic sequences. 
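
A degenerate pool of possible sequences can be enumerated mechanically once the degenerate positions are written out. The sketch below assumes the pool is specified with the standard IUPAC nucleotide ambiguity codes (for example R for A or G, Y for C or T, N for any base); the function name and the example probe are hypothetical and serve only to show how quickly pool size grows with the number of degenerate positions.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(probe):
    """Return every discrete sequence represented by a degenerate probe."""
    choices = (IUPAC[base] for base in probe.upper())
    return ["".join(combination) for combination in product(*choices)]


if __name__ == "__main__":
    pool = expand_degenerate("ATGNNY")  # hypothetical degenerate probe
    print(len(pool), pool[:4])
```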

Other means for producing specific hybridization probes for nucleic acids include the 
cloning of nucleic acid sequences into vectors for the production of mRNA probes. Such vectors 
are known in the art and are commercially available and may be used to synthesize RNA probes 
in vitro by means of the addition of the appropriate RNA polymerase, such as T7 or SP6 RNA
polymerase and the appropriate radioactively labeled nucleotides. The nucleotide sequences may 
be used to construct hybridization probes for mapping their respective genomic sequences. The 
nucleotide sequence provided herein may be mapped to a chromosome or specific regions of a 
chromosome using well known genetic and/or chromosomal mapping techniques. These 
techniques include in situ hybridization, linkage analysis against known chromosomal markers, 
hybridization screening with libraries or flow-sorted chromosomal preparations specific to 
known chromosomes, and the like. The technique of fluorescent in situ hybridization of
chromosome spreads has been described, among other places, in Verma et al (1988) Human
Chromosomes: A Manual of Basic Techniques, Pergamon Press, New York NY. 

Fluorescent in situ hybridization of chromosomal preparations and other physical 
chromosome mapping techniques may be correlated with additional genetic map data. Examples 
of genetic map data can be found in the 1994 Genome Issue of Science (265:1981f). Correlation
between the location of a nucleic acid on a physical chromosomal map and a specific disease (or 
predisposition to a specific disease) may help delimit the region of DNA associated with that 
genetic disease. The nucleotide sequences of the subject invention may be used to detect 
differences in gene sequences between normal, carrier or affected individuals. 

4.20 PREPARATION OF SUPPORT BOUND OLIGONUCLEOTIDES 

Oligonucleotides, i.e., small nucleic acid segments, may be readily prepared by, for 
example, directly synthesizing the oligonucleotide by chemical means, as is commonly practiced 
using an automated oligonucleotide synthesizer. 

Support bound oligonucleotides may be prepared by any of the methods known to those of 
skill in the art using any suitable support such as glass, polystyrene or Teflon. One strategy is to 
precisely spot oligonucleotides synthesized by standard synthesizers. Immobilization can be 
achieved using passive adsorption (Inouye & Hondo, (1990) J. Clin. Microbiol. 28(6) 1469-72);
using UV light (Nagata et al, 1985; Dahlen et al, 1987; Morrissey & Collins, (1989) Mol. Cell
Probes 3(2) 189-207) or by covalent binding of base modified DNA (Keller et al, 1988; 1989); all
references being specifically incorporated herein. 

Another strategy that may be employed is the use of the strong biotin-streptavidin 
interaction as a linker. For example, Broude et al (1994) Proc. Natl. Acad. Sci. USA 91(8) 3072-6,
describe the use of biotinylated probes, although these are duplex probes, that are immobilized on 
streptavidin-coated magnetic beads. Streptavidin-coated beads may be purchased from Dynal, Oslo. 
Of course, this same linking chemistry is applicable to coating any surface with streptavidin. 
Biotinylated probes may be purchased from various sources, such as, e.g., Operon Technologies 
(Alameda, CA). 

Nunc Laboratories (Naperville, IL) is also selling suitable material that could be used. Nunc 
Laboratories have developed a method by which DNA can be covalently bound to the microwell 
surface termed CovaLink NH. CovaLink NH is a polystyrene surface grafted with secondary amino
groups (>NH) that serve as bridge-heads for further covalent coupling. CovaLink Modules may be
purchased from Nunc Laboratories. DNA molecules may be bound to CovaLink exclusively at the
5'-end by a phosphoramidate bond, allowing immobilization of more than 1 pmol of DNA
(Rasmussen et al., (1991) Anal. Biochem. 198(1) 138-42).

The use of CovaLink NH strips for covalent binding of DNA molecules at the 5'-end has
been described (Rasmussen et al., (1991)). In this technology, a phosphoramidate bond is employed
(Chu et al., (1983) Nucleic Acids Res. 11(8) 6513-29). This is beneficial as immobilization using
only a single covalent bond is preferred. The phosphoramidate bond joins the DNA to the
CovaLink NH secondary amino groups that are positioned at the end of spacer arms covalently
grafted onto the polystyrene surface through a 2 nm long spacer arm. To link an oligonucleotide to
CovaLink NH via a phosphoramidate bond, the oligonucleotide terminus must have a 5'-end
phosphate group. It is, perhaps, even possible for biotin to be covalently bound to CovaLink and 
then streptavidin used to bind the probes. 

More specifically, the linkage method includes dissolving DNA in water (7.5 ng/ul) and
denaturing for 10 min. at 95°C and cooling on ice for 10 min. Ice-cold 0.1 M 1-methylimidazole,
pH 7.0 (1-MeIm7), is then added to a final concentration of 10 mM 1-MeIm7. A ss DNA solution is
then dispensed into CovaLink NH strips (75 ul/well) standing on ice.

Carbodiimide 0.2 M 1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide (EDC), dissolved in
10 mM 1-MeIm7, is made fresh and 25 ul added per well. The strips are incubated for 5 hours at
50°C. After incubation the strips are washed using, e.g., Nunc-Immuno Wash; first the wells are
washed 3 times, then they are soaked with washing solution for 5 min., and finally they are washed
3 times (wherein the washing solution is 0.4 N NaOH, 0.25% SDS heated to 50°C).
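
The quantities in the linkage method above follow from ordinary dilution arithmetic (C1 x V1 = C2 x V2), which the sketch below works through. It rests on stated assumptions: that each well holds 75 ul of DNA solution plus 25 ul of EDC solution for a total of 100 ul, and, for the per-well DNA mass, that the 7.5 ng/ul figure still applies to the dispensed solution, which makes that value an upper bound if the 1-methylimidazole addition dilutes the DNA. The function names are hypothetical.

```python
def diluted_concentration(stock_conc, added_volume, final_volume):
    """C2 = C1 * V1 / V2 for a stock diluted into final_volume."""
    if final_volume <= 0 or added_volume > final_volume:
        raise ValueError("volumes are inconsistent")
    return stock_conc * added_volume / final_volume


def stock_volume_for(stock_conc, final_conc, final_volume):
    """V1 = C2 * V2 / C1: stock volume needed to reach final_conc."""
    return final_conc * final_volume / stock_conc


if __name__ == "__main__":
    well_volume_ul = 75.0 + 25.0  # assumed: DNA solution plus EDC solution
    # 0.2 M EDC, 25 ul per well, diluted into the assumed 100 ul well volume.
    print("EDC in well (M):", diluted_concentration(0.2, 25.0, well_volume_ul))
    # 0.1 M (100 mM) 1-methylimidazole stock brought to 10 mM in 75 ul.
    print("1-MeIm stock per 75 ul (ul):", stock_volume_for(100.0, 10.0, 75.0))
    # Upper bound on DNA per well if 7.5 ng/ul applies to the dispensed solution.
    print("DNA per well (ng):", 7.5 * 75.0)
```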

It is contemplated that a further suitable method for use with the present invention is that 
described in PCT Patent Application WO 90/03382 (Southern & Maskos), incorporated herein by 
reference. This method of preparing an oligonucleotide bound to a support involves attaching a 
nucleoside 3'-reagent through the phosphate group by a covalent phosphodiester link to aliphatic
hydroxyl groups carried by the support. The oligonucleotide is then synthesized on the supported
nucleoside and protecting groups removed from the synthetic oligonucleotide chain under standard
conditions that do not cleave the oligonucleotide from the support. Suitable reagents include
nucleoside phosphoramidite and nucleoside hydrogen phosphonate.

An on-chip strategy for the preparation of DNA probe
arrays may be employed. For example, addressable laser-activated photodeprotection may be
employed in the chemical synthesis of oligonucleotides directly on a glass surface, as described by
Fodor et al (1991) Science 251(4995) 767-73, incorporated herein by reference. Probes may also
be immobilized on nylon supports as described by Van Ness et al (1991) Nucleic Acids Res.
19(12) 3345-50; or linked to Teflon using the method of Duncan & Cavalier (1988) Anal. Biochem.
169(1) 104-8; all references being specifically incorporated herein.

Linking an oligonucleotide to a nylon support, as described by Van Ness et al. (1991),
requires activation of the nylon surface via alkylation and selective activation of the 5'-amine of
oligonucleotides with cyanuric chloride.

One particular way to prepare support bound oligonucleotides is to utilize the 
light-generated synthesis described by Pease et al, (1994) PNAS USA 91(11) 5022-6, incorporated
herein by reference. These authors used current photolithographic techniques to generate arrays of
immobilized oligonucleotide probes (DNA chips). These methods, in which light is used to direct
the synthesis of oligonucleotide probes in high-density, miniaturized arrays, utilize photolabile
5'-protected N-acyl-deoxynucleoside phosphoramidites, surface linker chemistry and versatile
combinatorial synthesis strategies. A matrix of 256 spatially defined oligonucleotide probes may be 
generated in this manner. 
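
The figure of 256 equals 4^4, the number of distinct sequences obtained when four positions each range over the four bases. Whether the cited array was laid out in this way is not stated here, so the following is offered only as an illustration of combinatorial enumeration: it lists every sequence of a given length and assigns each a row and column on a square grid. The function names and the row-major layout are assumptions.

```python
from itertools import product


def all_kmers(k):
    """Return all 4**k sequences of length k over A, C, G, T."""
    return ["".join(bases) for bases in product("ACGT", repeat=k)]


def grid_layout(probes, columns):
    """Map each probe to a (row, column) address in row-major order."""
    return {probe: divmod(index, columns) for index, probe in enumerate(probes)}


if __name__ == "__main__":
    probes = all_kmers(4)
    layout = grid_layout(probes, 16)
    print(len(probes), layout["ACGT"])
```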

4.21 PREPARATION OF NUCLEIC ACID FRAGMENTS 

The nucleic acids may be obtained from any appropriate source, such as cDNAs, genomic 
DNA, chromosomal DNA, microdissected chromosome bands, cosmid or YAC inserts, and RNA, 
including mRNA without any amplification steps. For example, Sambrook et al (1989) describes
three protocols for the isolation of high molecular weight DNA from mammalian cells (p. 
9.14-9.23). 

DNA fragments may be prepared as clones in M13, plasmid or lambda vectors and/or
prepared directly from genomic DNA or cDNA by PCR or other amplification methods. Samples
may be prepared or dispensed in multiwell plates. About 100-1000 ng of DNA samples may be
prepared in 2-500 ml of final volume. 

The nucleic acids would then be fragmented by any of the methods known to those of skill 
in the art including, for example, using restriction enzymes as described at 9.24-9.28 of Sambrook et 
al. (1989), shearing by ultrasound and NaOH treatment. 

Low pressure shearing is also appropriate, as described by Schriefer et al (1990) Nucleic
Acids Res. 18(24) 7455-6, incorporated herein by reference. In this method, DNA samples are
passed through a small French pressure cell at a variety of low to intermediate pressures. A lever
device allows controlled application of low to intermediate pressures to the cell. The results of these
studies indicate that low-pressure shearing is a useful alternative to sonic and enzymatic DNA
fragmentation methods.

One particularly suitable way for fragmenting DNA is contemplated to be that using the two 
base recognition endonuclease, Cv/JI, described by Fitzgerald et al (1 992) Nucleic Acids Res. 
20(14) 3753-62. These authors described an approach for the rapid fragmentation and fractionation 
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of DNA into particular sizes that they contemplated to be suitable for shotgun cloning and 
sequencing. 

The restriction endonuclease CviJI normally cleaves the recognition sequence PuGCPy 
between the G and C to leave blunt ends. Atypical reaction conditions, which alter the specificity of 
this enzyme (CviJI**), yield a quasi-random distribution of DNA fragments from the small 
molecule pUC19 (2688 base pairs). Fitzgerald et al. (1992) quantitatively evaluated the 
randomness of this fragmentation strategy, using a CviJI** digest of pUC19 that was size 
fractionated by a rapid gel filtration method and directly ligated, without end repair, to a lac Z minus 
M13 cloning vector. Sequence analysis of 76 clones showed that CviJI** restricts PyGCPy and 
PuGCPu, in addition to PuGCPy sites, and that new sequence data is accumulated at a rate 
consistent with random fragmentation. 
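
The site specificities described above can be illustrated with a short sketch. It is only an in silico illustration, not part of the cited method: it assumes a linear sequence containing only A, C, G and T, and the function names `cut_positions` and `digest` are hypothetical. Pu is taken as A or G and Py as C or T, and the cut is placed between the G and the C of each site.

```python
import re

# Pu = purine (A or G), Py = pyrimidine (C or T).
NORMAL_SITE = "[AG]GC[CT]"                          # PuGCPy
RELAXED_SITES = "[AG]GC[CT]|[CT]GC[CT]|[AG]GC[AG]"  # plus PyGCPy and PuGCPu


def cut_positions(seq, relaxed=False):
    """Return 0-based indices at which the sequence is cut (between G and C)."""
    pattern = RELAXED_SITES if relaxed else NORMAL_SITE
    # A lookahead is used so that overlapping sites are all reported.
    finder = re.compile("(?=(?:%s))" % pattern)
    return [m.start() + 2 for m in finder.finditer(seq.upper())]


def digest(seq, relaxed=False):
    """Split a linear sequence into blunt-ended fragments at every cut position."""
    bounds = [0] + cut_positions(seq, relaxed) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]
```

With `relaxed=False` only PuGCPy sites are cut; with `relaxed=True` the PyGCPy and PuGCPu sites reported for CviJI** are cut as well, while PyGCPu remains uncut in both modes.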

As reported in the literature, advantages of this approach compared to sonication and 
agarose gel fractionation include: smaller amounts of DNA are required (0.2-0.5 ug instead of 2-5 
ug); and fewer steps are involved (no preligation, end repair, chemical extraction, or agarose gel 
electrophoresis and elution are needed). 

Irrespective of the manner in which the nucleic acid fragments are obtained or prepared, it is 
important to denature the DNA to give single stranded pieces available for hybridization. This is 
achieved by incubating the DNA solution for 2-5 minutes at 80-90°C. The solution is then cooled 
quickly to 2°C to prevent renaturation of the DNA fragments before they are contacted with the 
chip. Phosphate groups must also be removed from genomic DNA by methods known in the art. 

4.22 PREPARATION OF DNA ARRAYS 

Arrays may be prepared by spotting DNA samples on a support such as a nylon membrane. 
Spotting may be performed by using arrays of metal pins (the positions of which correspond to an 
array of wells in a microtiter plate) to repeatedly transfer about 20 nl of a DNA solution to a 
nylon membrane. By offset printing, a density of dots higher than the density of the wells is 
achieved. One to 25 dots may be accommodated in 1 mm², depending on the type of label used. By 
avoiding spotting in some preselected number of rows and columns, separate subsets (subarrays) 
may be formed. Samples in one subarray may be the same genomic segment of DNA (or the same 
gene) from different individuals, or may be different, overlapped genomic clones. Each of the 
subarrays may represent replica spotting of the same samples. In one example, a selected gene 
segment may be amplified from 64 patients. For each patient, the amplified gene segment may be in 
one 96-well plate (all 96 wells containing the same sample). A plate for each of the 64 patients is 
prepared. By using a 96-pin device, all samples may be spotted on one 8 x 12 cm membrane. 
Subarrays may contain 64 samples, one from each patient. Where the 96 subarrays are identical, the 
dot span may be 1 mm² and there may be a 1 mm space between subarrays. 
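
The 64-patient example can be sketched as a coordinate calculation. The sketch assumes a 9 mm pin pitch (the usual spacing of a 96-well plate, which is not stated in the text) and an 8 x 8 arrangement of the 64 dots within each subarray. Under those assumptions each subarray spans 8 mm and leaves the 1 mm gap mentioned above. The names `dot_position` and `layout` are hypothetical.

```python
WELL_PITCH_MM = 9.0   # assumed pin-to-pin spacing of a 96-pin device
DOT_PITCH_MM = 1.0    # one dot per 1 mm step within a subarray
GRID_SIDE = 8         # 8 x 8 = 64 dots per subarray (assumed arrangement)
PIN_ROWS, PIN_COLS = 8, 12


def dot_position(pin_row, pin_col, sample):
    """Return (x, y) in mm of the dot printed by one pin for one sample plate."""
    if not (0 <= pin_row < PIN_ROWS and 0 <= pin_col < PIN_COLS):
        raise ValueError("pin index out of range")
    if not 0 <= sample < GRID_SIDE * GRID_SIDE:
        raise ValueError("sample index out of range")
    # Each successive plate is printed at a different offset inside the subarray.
    off_row, off_col = divmod(sample, GRID_SIDE)
    x = pin_col * WELL_PITCH_MM + off_col * DOT_PITCH_MM
    y = pin_row * WELL_PITCH_MM + off_row * DOT_PITCH_MM
    return (x, y)


def layout():
    """Map every (pin_row, pin_col, sample) to its dot position on the membrane."""
    return {
        (r, c, s): dot_position(r, c, s)
        for r in range(PIN_ROWS)
        for c in range(PIN_COLS)
        for s in range(GRID_SIDE * GRID_SIDE)
    }
```

Printing the 64 plates in turn, each at its own offset, gives 96 identical subarrays of 64 dots, all within an area of roughly 11 x 7 cm.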

Another approach is to use membranes or plates (available from NUNC, Naperville, Illinois) 
which may be partitioned by physical spacers e.g. a plastic grid molded over the membrane, the grid 
being similar to the sort of membrane applied to the bottom of multiwell plates, or hydrophobic 
strips. A fixed physical spacer is not preferred for imaging by exposure to flat phosphor-storage 
screens or x-ray films. 

The present invention is illustrated in the following examples. Upon consideration of the 
present disclosure, one of skill in the art will appreciate that many other embodiments and variations 
may be made in the scope of the present invention. Accordingly, it is intended that the broader 
aspects of the present invention not be limited to the disclosure of the following examples. The 
present invention is not to be limited in scope by the exemplified embodiments which are intended 
as illustrations of single aspects of the invention, and compositions and methods which are 
functionally equivalent are within the scope of the invention. Indeed, numerous modifications and 
variations in the practice of the invention are expected to occur to those skilled in the art upon 
consideration of the present preferred embodiments. Consequently, the only limitations which 
should be placed upon the scope of the invention are those which appear in the appended claims. 

All references cited within the body of the instant specification are hereby incorporated by 
reference in their entirety. 

5.0 EXAMPLES 

5.1.1 EXAMPLE 1 

Novel Nucleic Acid Sequences Obtained From Various Libraries 

A plurality of novel nucleic acids were obtained from cDNA libraries prepared from various 
human tissues and in some cases isolated from a genomic library derived from human chromosome 
using standard PCR, SBH sequence signature analysis and Sanger sequencing techniques. The 
inserts of the library were amplified with PCR using primers specific for the vector sequences which 
flank the inserts. Clones from cDNA libraries were spotted on nylon membrane filters and screened 
with oligonucleotide probes (e.g., 7-mers) to obtain signature sequences. The clones were clustered 
into groups of similar or identical sequences. Representative clones were selected for sequencing. 
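
The idea of grouping clones by probe signature can be sketched as follows. This is an illustration only and is not the SBH signature analysis actually used, which the text does not detail. It assumes a signature is simply the set of probes found as exact substrings of a clone, and it groups clones greedily by Jaccard similarity with an arbitrary threshold. All function names are hypothetical.

```python
def signature(clone, probes):
    """Set of probes that occur as exact substrings of the clone sequence."""
    clone = clone.upper()
    return frozenset(p for p in probes if p in clone)


def jaccard(a, b):
    """Jaccard similarity of two probe sets (1.0 when both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster(clones, probes, threshold=0.8):
    """Greedily group clone names whose signatures resemble a cluster's first member."""
    clusters = []  # list of (representative signature, member names)
    for name, seq in clones.items():
        sig = signature(seq, probes)
        for rep_sig, members in clusters:
            if jaccard(sig, rep_sig) >= threshold:
                members.append(name)
                break
        else:
            clusters.append((sig, [name]))
    return [members for _, members in clusters]
```

One member of each resulting group would then stand in as the representative clone selected for sequencing.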

In some cases, the 5' sequence of the amplified inserts was then deduced using a typical 
Sanger sequencing protocol. PCR products were purified and subjected to fluorescent dye 
terminator cycle sequencing. Single pass gel sequencing was done using a 377 Applied Biosystems 
(ABI) sequencer to obtain the novel nucleic acid sequences. In some cases RACE (Rapid 
Amplification of cDNA Ends) was performed to further extend the sequence in the 5' direction. 

5.1.2 EXAMPLE 2 
Assemblage of Novel Nucleic Acids 

The contigs or nucleic acids of the present invention, designated as SEQ ID NO: 3573-5358, 
were assembled using an EST sequence as a seed. Then a recursive algorithm was used to extend 
the seed EST into an extended assemblage, by pulling additional sequences from different databases 
(i.e., Hyseq's database containing EST sequences, dbEST version 114, gb pri 114, and UniGene 
version 101) that belong to this assemblage. The algorithm terminated when there were no 
additional sequences from the above databases that would extend the assemblage. Inclusion of 
component sequences into the assemblage was based on a BLASTN hit to the extending assemblage 
with BLAST score greater than 300 and percent identity greater than 95%. 
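
The seed-extension loop described above can be sketched in a few lines. This is a minimal sketch, not the algorithm actually used: the BLASTN comparison is replaced by a hypothetical stand-in, `toy_hit`, which scores two sequences by the length of their longest shared block and always reports 100% identity. Only the thresholds (score greater than 300, identity greater than 95%) come from the text.

```python
from difflib import SequenceMatcher


def toy_hit(candidate, member):
    """Stand-in for a BLASTN hit: (longest shared block length, percent identity)."""
    matcher = SequenceMatcher(None, candidate, member, autojunk=False)
    block = matcher.find_longest_match(0, len(candidate), 0, len(member))
    return float(block.size), 100.0


def extend_assemblage(seed, database, hit=toy_hit,
                      min_score=300.0, min_identity=95.0):
    """Grow an assemblage from a seed until no database sequence extends it."""
    assemblage = [seed]
    remaining = [s for s in database if s != seed]
    extended = True
    while extended:                      # repeat until a full pass adds nothing
        extended = False
        for candidate in list(remaining):
            for member in assemblage:
                score, identity = hit(candidate, member)
                if score > min_score and identity > min_identity:
                    assemblage.append(candidate)
                    remaining.remove(candidate)
                    extended = True
                    break
    return assemblage
```

A sequence that does not match the seed directly can still join on a later pass, once a sequence it does match has been added. This is what makes the extension recursive.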

A polypeptide was predicted to be encoded by each of SEQ ID NO:3573-5358 as set forth 
below. The polypeptides were predicted using a software program called FASTY (available from 
http://fasta.bioch.virginia.edu) which selects a polypeptide based on a comparison of translated 
novel polynucleotide to known polypeptides (W.R. Pearson, Methods in Enzymology, 183:63-98 
(1990), herein incorporated by reference). The predicted polypeptides are shown in Table 7. 

5.2.2 EXAMPLE 3 
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA 
sequence and its corresponding protein sequence were generated from the assemblage. Any frame 
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was 
checked using FASTY and/or BLAST against Genbank. Other computer programs which may 
have been used in the editing process were phredPhrap and Consed (University of Washington) and 
ed-ready, ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice 
variants, resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 1-327. 

Table 1 shows the various tissue sources of SEQ ID NO: 1-327. 

The nearest neighbor results for SEQ ID NO: 1-327 were obtained by a FASTA version 3 
search against Genpept release 117, using the FASTXY algorithm. FASTXY is an improved 
version of FASTA alignment which allows in-codon frame shifts. The nearest neighbor result 
showed the closest homologue for SEQ ID NO: 1-327 from Genpept. The translated amino acid 
sequences for which the nucleic acid sequence encodes are shown in the Sequence Listing. The 
nearest neighbor results for SEQ ID NO: 1-327 are shown in Table 2 below. 

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp. 
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were 
examined to determine whether they had identifiable signature regions. Table 3 shows the 
signature region found in the indicated polypeptide sequences, the description of the signature, 
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence. 

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1) 
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were 
examined for domains with homology to certain peptide domains. Table 4 shows the name of 
the domain found, the description, the p-value and the pFam score for the identified domain 
within the sequence. 

The nucleotide sequence within the sequences that codes for signal peptide sequences and 
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from 
Center for Biological Sequence Analysis, The Technical University of Denmark). The process 
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also 
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the 
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their 
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by 
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference, 
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in 
each of the polypeptides and the maximum score and mean score associated with that signal 
peptide. 
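
The two summary scores can be illustrated with a small calculation. The sketch assumes that a per-residue S score and a predicted signal peptide length have already been produced by the prediction program. It takes the maximum S score as the highest per-residue value and the mean S score as the average over the residues of the predicted signal peptide, which is our reading of the cited reference rather than a statement of the program's internals. The function name `s_score_summary` is hypothetical.

```python
def s_score_summary(s_scores, signal_length):
    """Return (maximum S score, mean S score over the predicted signal peptide).

    s_scores      -- per-residue S scores, N-terminus first (assumed input)
    signal_length -- number of residues before the predicted cleavage site
    """
    if not 0 < signal_length <= len(s_scores):
        raise ValueError("signal_length must lie within the scored region")
    max_s = max(s_scores)
    mean_s = sum(s_scores[:signal_length]) / signal_length
    return max_s, mean_s
```

A polypeptide with high values for both scores over a short N-terminal stretch is the kind of candidate that would be listed in Table 5.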

5.3.2 EXAMPLE 4 

Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA 
sequence and its corresponding protein sequence were generated from the assemblage. Any frame 
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was 
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 117, gb pri 117, 
UniGene version 117, Genpept release 117). Other computer programs which may have been used 
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready, 
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants, 
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 328-1413. 

Table 1 shows the various tissue sources of SEQ ID NO: 328-1413. 

The nearest neighbor results for SEQ ID NO: 328-1413 were obtained by a BLASTP 
version 2.0al 19MP-WashU search against Genpept release 118, using the BLAST algorithm. The 
nearest neighbor result showed the closest homologue for SEQ ID NO: 328-1413 from Genpept. 
The translated amino acid sequences for which the nucleic acid sequence encodes are shown in 
the Sequence Listing. The nearest neighbor results for SEQ ID NO: 328-1413 are shown in 
Table 2 below. 

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp. 
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were 
examined to determine whether they had identifiable signature regions. Table 3 shows the 
signature region found in the indicated polypeptide sequences, the description of the signature, 
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence. 

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1) 
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were 
examined for domains with homology to certain peptide domains. Table 4 shows the name of 
the domain found, the description, the p-value and the pFam score for the identified domain 
within the sequence. 

The nucleotide sequence within the sequences that codes for signal peptide sequences and 
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from 
Center for Biological Sequence Analysis, The Technical University of Denmark). The process 
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also 
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the 
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their 
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by 
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference, 
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in 
each of the polypeptides and the maximum score and mean score associated with that signal 
peptide. 

5.3.2 EXAMPLE 5 

Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA 
sequence and its corresponding protein sequence were generated from the assemblage. Any frame 
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was 
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 117, gb pri 117, 
UniGene version 117, Genpept release 117). Other computer programs which may have been used 
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready, 
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants, 
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 1414-1652. 

Table 1 shows the various tissue sources of SEQ ID NO: 1414-1652. 

The nearest neighbor results for SEQ ID NO: 1414-1652 were obtained by a BLASTP 
version 2.0al 19MP-WashU search against Genpept release 118, using the BLAST algorithm. The 
nearest neighbor result showed the closest homologue for SEQ ID NO: 1414-1652 from 
Genpept. The translated amino acid sequences for which the nucleic acid sequence encodes are 
shown in the Sequence Listing. The nearest neighbor results for SEQ ID NO: 1414-1652 are 
shown in Table 2 below. 

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp. 
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were 
examined to determine whether they had identifiable signature regions. Table 3 shows the 
signature region found in the indicated polypeptide sequences, the description of the signature, 
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence. 

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1) 
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were 
examined for domains with homology to certain peptide domains. Table 4 shows the name of 
the domain found, the description, the p-value and the pFam score for the identified domain 
within the sequence. 

The nucleotide sequence within the sequences that codes for signal peptide sequences and 
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from 
Center for Biological Sequence Analysis, The Technical University of Denmark). The process 
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also 
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the 
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their 
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by 
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference, 
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in 
each of the polypeptides and the maximum score and mean score associated with that signal 
peptide. 

5.4.2 EXAMPLE 6 
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA 
sequence and its corresponding protein sequence were generated from the assemblage. Any frame 
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was 
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 118, gb pri 118, 
UniGene version 118, Genpept release 118). Other computer programs which may have been used 
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready, 
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants, 
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 1653-1745. 

Table 1 shows the various tissue sources of SEQ ID NO: 1653-1745. 

The homology for SEQ ID NO: 1653-1745 was obtained by a BLASTP version 2.0al 
19MP-WashU search against Genpept release 118, using the BLAST algorithm. The results showed 
homologues for SEQ ID NO: 1653-1745 from Genpept. The translated amino acid sequences for 
which the nucleic acid sequence encodes are shown in the Sequence Listing. The homologues 
with identifiable functions for SEQ ID NO: 1653-1745 are shown in Table 2 below. 

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp. 
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were 
examined to determine whether they had identifiable signature regions. Table 3 shows the 
signature region found in the indicated polypeptide sequences, the description of the signature, 
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence. 

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1) 
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were 
examined for domains with homology to certain peptide domains. Table 4 shows the name of 
the domain found, the description, the p-value and the pFam score for the identified domain 
within the sequence. 

The nucleotide sequence within the sequences that codes for signal peptide sequences and 
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from 
Center for Biological Sequence Analysis, The Technical University of Denmark). The process 
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also 
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the 
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their 
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by 
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference, 
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in 
each of the polypeptides and the maximum score and mean score associated with that signal 
peptide. 

5.5.2 EXAMPLE 7 
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA 
sequence and its corresponding protein sequence were generated from the assemblage. Any frame 
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was 
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 119, gb pri 119, 
UniGene version 119, Genpept release 119). Other computer programs which may have been used 
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready, 
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants, 
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 1746-1768. 

Table 1 shows the various tissue sources of SEQ ID NO: 1746-1768. 

The homology for SEQ ID NO: 1746-1768 was obtained by a BLASTP version 2.0al 
19MP-WashU search against Genpept release 119, using the BLAST algorithm. The results showed 
homologues for SEQ ID NO: 1746-1768 from Genpept. The translated amino acid sequences for 
which the nucleic acid sequence encodes are shown in the Sequence Listing. The homologues 
with identifiable functions for SEQ ID NO: 1746-1768 are shown in Table 2 below. 

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp. 
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were examined 
to determine whether they had identifiable signature regions. Table 3 shows the signature region 
found in the indicated polypeptide sequences, the description of the signature, the eMatrix p- 
value(s) and the position(s) of the signature within the polypeptide sequence. 

Using the PFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1) 
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were 
examined for domains with homology to certain peptide domains. Table 4 shows the name of 
the domain found, the description, the p-value and the PFam score for the identified domain 
within the sequence. 

The nucleotide sequence within the sequences that codes for signal peptide sequences and 
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from 
Center for Biological Sequence Analysis, The Technical University of Denmark). The process for 
identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also disclosed by 
Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the publication 
"Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites" 
Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by reference. A maximum 
S score and a mean S score, as described in the Nielsen et al. reference, were obtained for the 
polypeptide sequences. Table 5 shows the position of the signal peptide in each of the polypeptides 
and the maximum score and mean score associated with that signal peptide. 

5.6.2 EXAMPLE 8 

Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA 
sequence and its corresponding protein sequence were generated from the assemblage. Any frame 
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was 
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 120, gb pri 120, 
UniGene version 120, Genpept release 120). Other computer programs which may have been used 
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready, 
ed-ext and gc-zip-2 (Hyseq, Inc.). The translated amino acid sequences for which the nucleic acid 
sequence encodes are shown in the Sequence Listing. The full-length nucleotide sequences, including 
splice variants, resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 
1769-1786. 

Table 1 shows the various tissue sources of SEQ ID NO: 1769-1786. 

The homology for SEQ ID NO: 1769-1786 was obtained by a BLASTP version 2.0al 
19MP-WashU search against Genpept release 120 and the amino acid version of Geneseq 
released on October 26, 2000, using the BLAST algorithm. The results showed homologues for 
SEQ ID NO: 1769-1786 from Genpept. The homologues with identifiable functions for SEQ ID 
NO: 1769-1786 are shown in Table 2 below. 

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp. 
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were 
examined to determine whether they had identifiable signature regions. Table 3 shows the 
signature region found in the indicated polypeptide sequences, the description of the signature, 
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence. 

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1) 
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were 
examined for domains with homology to certain peptide domains. Table 4 shows the name of 
the domain found, the description, the p-value and the pFam score for the identified domain 
within the sequence. 

The nucleotide sequence within the sequences that codes for signal peptide sequences and 
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from 
Center for Biological Sequence Analysis, The Technical University of Denmark). The process 
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also 
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the 
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their 
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by 
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference, 
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in 
each of the polypeptides and the maximum score and mean score associated with that signal 
peptide. 

Table 6 is a correlation table of all of the sequences and the SEQ ID NOS. 

TABLE 1 



Tissue Origin 



adult brain 



RNA Source 



Hyseq 
Library Name 



SEQ ID NOS: 



GIBCO 



adult brain 



GIBCO 



A B3 0 01~ 1 9 19-21 50-51 65-66 72 78 80 82 
85 87 107-108 113 116 123 138 
1 140 150-152 159 169 177 192-193 
202-203 212-214 225-226 235-236 
251 258 268-269 272 280-281 295 
I 298 301 321 326 331-332 334 356- 
357 362 369 379 382-383 416 423 
443 459-460 473 475 477 488 496 
500 503 519 526 547 574 582 587 
608-609 613 618 633-634 645-646 
652 657-658 660 669-671 678 687 
695 697 710 715 724 731 775-777 
796 804 811 857-859 862 869 899- 
900 912 919 922 924-929 933 936 
962 979 988-989 996 1001 1004- 
1008 1018 1039 1047 1059 1064 
1067 1070 1078 1082 1107 1113 
1116-1117 1131 1134-1137 1140 
1149 1151 1157 1180 1206 1229 
1234 1241 1243 1258 1272-1273 
1279 1288-1290 1294 1307-1308 
1312 1320 1323 1330 1356 1360- 
1361 1368 1373-1375 1379 1391 
1400 1417 1446 1468 1482 1493- 
1494 1501-1503 1506-1507 1512 
1517 1522-1524 1530-1533 1537 
1549 1565 1578 1598 1606 1608 
1623 1625 1627 1639 1643 1648- 
1649 1653 1664 1667 1671 1696 
1734 1741 1743-1744 1760-1761 

_ 1771 

~ABD003 | 3 12-14 18-19 25 30-31 34-36 43- 

45 50-51 56 58 60 65-6S 68-69 80 
82 85 87 92 104 107-108 112-113 
115-116 123-124 131-132 135-137 
139 142 146 148-149 152 154 157 
159 163 165 167 169 172 180 192- 
193 196-197 199 203 208 210 212- 
214 223 233 235-237 247 257 259 
261 268-269 272 276 280-281 284- 
285 291-292 295 297 300-301 304 
307 317 320-321 323 327 329-331 
333-334 345-349 356-357 379-381 
393 401 408 414 419 424 426-428 
430 433-436 438-439 443 445 449 
453-454 459-461 468 471-473 476- 
478 483 491 494 496 500 503 507- 
508 516 519-520 525-527 534 536- 
540 542-543 545 553 555 560 569- 
570 574-576 586-588 593 595 597 
601 606-609 616-620 622-623 625 
628-633 635-636 643 645-649 653 
655-656 660-665 668-670 676 681 
687 701 710 715 717 724-728 735 
743 745-746 753 753 759 765-766 
773 775-778 786 789 796 799-800 
802-803 810-811 815 817 820-821 
832 834-836 840 845-847 851 858- 
861 864 869 874 878 883 897 901- 
902 904-905 908 911-914 916 921- 
922 924-927 929 932-934 936-939 
941-942 945 955-958 963 966-969 
977 979-980 985-986 990 992-993 
997-1001 1005-1007 1012 1017- 
1020 1023-1024 1029-1031 1034 
1036 1039 1050 1059 1063-1066 
1078 1081-1082 1085-1086 1089 
2097 1103 1107 1103 1112 1116- 

1117 1119 1121 1124 1127 1130 

1134 1144-1145 1149 1151 1157- 

1158 1167 1170 1178 1184 1188 

1190 1193-1194 1200 1202 1215- 

1217 1220 1226-1227 1229 1231 

1241 1243 1247 1252 1258 



1267 1269 1279 1281 1284 



1263 
1286- 



1289 1293-1294 1306-1307 1312 
1316-1320 1326 1333 1338 1341 
1344 1348 1351 1355-1357 1368 
1374 1377 1380 1386 1389-1390 
1394 1400 1409 1414 1422-1423 
1425-1427 1437 1443 1445 1454 
1456 1458-1459 1468 1470-1472 
1478 1482-1483 1487-1488 1493 
1497 1499 1506 1508-1511 1517 
1522-1524 1530-1533 1545-1546 
1548-1550 1552 1557-1559-1563 



1565 1567 1569 1571 1586 
1591 1593 1595 1598-1601 



1588 
1608 



adult brain 



Clontech 



1611 1620-1621 1624-1626 1628 
1630-1632 1636 1640-1641 1644- 
1645 1647 1649 1653-1655 1657 
1664 1667 1669 1673 1678-1681 
1686 1690 1694-1696 1701 1709 
1711 1719 1722-1723 1726-1727 
1731-1733 1738 1740 1743-1744 
1747 1749 1753 1757-1758 1760- 
1761 1765 1771 1785 



ABR001 



adult brain 



adult brain 



Clontech 



9 29 68-69 113 115 146 152 206 
223 245 277 307 320 324 330-331 
344 348 352 362 379 384 393 404 
408 414 441-442 454 469 481 490 
506 517 586 597 631 641 659 691 
715 799 003 833 865 871 875 880 
882 908 920 937 1000 1005-1006 
1027 1036 1041 1043 1075 1107 
1112 1121 1127 1136-1137 1144- 
1147 1231 1238-1239 1280 1293 
1320 1345 1355 1361 1383-1384 
1400 1417 1448 1456 1476 1507 
1570 1572 1609-1610 1614 1620 
1626 1645 1653 1754 1759 1770 
1786 



ABR006 



Clontech 



ABR008 



5-8 15-16 168 212-213 271 278 
280-281 291-292 300-301 310 314 
321 326 336-338 341 352 357 359- 
360 362 369 374 379 384 393 396- 
397 414 419-420 426-428 430 441- 
442 453 506 616-617 661 689 785 
798 845 1018 1109 1113 1124 1148 
1167 1187 1207 1227 1252 1265 
1285 1312 1317-1319 1324-1327 
1344 1369 1381 1400 1416 1421 
1427 1430-1431 1436 1471 1501 
1557-1559 1586 1588 1651 1653 
1664-1655 1671 1673 1690 1697- 
1698 1700 1711 1717 1719-1720 
1728 1736 1740 1743-1744 1757 
1760-1751 



10 13-19 22-23 25 29 33 37-39 
43-45 50-51 54-55 57-53 60-66 
60-70 72 75 77-80 83 85 89-92 94 
99-105 108-110 112-113 116-117 
123 128 133 135-137 139 143 145- 
146 148 152 154-155 157 166 168- 
172 174-175 181-184 188-190 193- 
194 196 198-200 202 204-205 207- 
208 210 214-215 218 221-226 229"" 
231-232 234-241 245-247 251-253 
255 257-259 268-269 271 276-281 
285-286 288 290-292 300-302 304 
307 309-311 313 315 317-318 320- 
322 325-326 328 330-331 333-338 
341 344-347 349 352 354 356-357 
362 369-373 376 379-380 382 384 
387 390-391 393-394 397 399-403 
405-411 414-415 417-420 426-428 
437-438 440-444 453-435 462 464 
467 469-471 476 478 482-484 488- 
491 497 503 506-513 516-517 520 
524-526 528-530 532-534 537-540 
542 544 547-551 553 561 565-567 
572-574 577 581 585 587-588 590- 
591 597 599 601-602 606-610 612 
615-617 619-620 622-623 628-629 
631 633-634 636-641 643 645-647 
651-653 655-664 669-671 673 679 
682 687 689 691-700 702 706 710 
715-717 720-721 725-734 736-739 
742-743 746 750-752 756 758-759 
762-764 766 768 773-778 780-782 
784-785 787-789 794 796 799 802- 
803 805 811 814-815 818 825-826 
834-837 839-840 842-843 856-859 
861-862 865 867-872 874-875 881 
883-884 887 889-892 894-895 897- 
898 901 904 908 910 912 914 917 
919 921-924 926-927 930-932 935- 
941 943 945 949 953-954 958 961- 
963 967 969 971 975 977 981-983 
986 988-990 992 997 999-1002 
1004-1006 1008 1012 1018-1023 
1027 1029-1031 1035-1037 1047- 
1048 1053 1057 1059 1063 1068 
1070 1072-1075 1077 1081-1083 
1085-1093 1095-1096 1108-1112 
1114-1125 1127 1131-1133 1135- 
1138 1142-1145 1148-1158 1160- 
1163 1167 1169 1172 1175 1177 
1180 1183-1188 1191-1195 1199- 
1200 1204 1206 1211 1213-1216 
1222-1223 1226-1227 1229-1231 
1234-1235 1241-1242 1244-1263 
1266 1269-1271 1276-1277 1279- 
1281 1284-1286 1292 1294-1295 
1299 1305-1309 1312 1314 1316- 
1319 1322 1324-1327 1330 1332 
1334-1335 1339 1344-1346 1351 
1354-1355 1357-1358 1365-1367 
1369-1370 1373-1374 1376-1379 
1381-1384 1386-1388 1392 1394 
1396-1397 1400 1403-1407 1410 
1414 1419-1420 1423 1432-1433 
1435 1437-1438 1440-1442 1446 
1448 1453-1455 1457 1461 1463- 
1464 1466 1468 1471 1477 1480 
1482-1483 1496 1502-1504 1507- 
1509 1513 1519-1520 1524-1526 
1536 1547 1549-1552 1567 1573- 
1574 1578 1586-1589 1597-1598 
1601-1602 1605 1607-1609 1611- 
1617 1619-1621 1623 1625-1626 
1635-1641 1643-1645 1649 1651 
1653 1656-1658 1664 1669 1671- 
1674 1676-1684 1686 1689-1690
1694-1696 1704-1705 1708-1709



adult brain



Clontech 



1720-1724 1726-1728 1730-1733 
1737-1740 1742-1745 1753 1756- 
1757 1759-1761 1765 1767 1771- 
1772 1776-1777 1779-1780 1786



ABR011 



adult brain 
adult brain 



24 75 103 186 210 310-311 364- 
365 508 623 710 937 1002-1003 
1059 1204 1609 1731-1732 



BioChain 



ABR012 



46 182-184 204-205 300 739 767 
1371 1549 1620 1684 



Invitrogen 



ABR013 



adult brain 



185 204-205 364-365 393 497 595 
687 692-694 830 845 1068 1320 
1413 1640 



Invitrogen 



ABR014 



adult brain 



187 301 357 364-365 375 454 463 
731 859 939 983 1073 1262 1270 
1320 1403 1640 1651 1657 1696 
1722 1738 



Invitrogen 



ABR015 



adult brain 
adult brain 



419 434-435 441-442 763 789 983
1320 



Invitrogen 



ABR016 



Invitrogen 



312 364-365 379 1320 1334-1335 
1674 1722 1785 



ABT004



14-16 22-23 25 37-39 43 58 60 

70-72 78 86 94 107 113 116 136- 
137 143 146 152 161 173 182-184 
194 196 198 210 218 229 259 267
295 298 309-310 320-321 324 336- 
338 346-347 349-350 356-357 362 
371 379-380 382-383 391 393 396 
399 401 408 428 438 459 461 476 
482 490 502 507-509 516 526 531
557 562 597 602 607-609 624 652 
655 667 669 671-672 687-689 695- 
696 710 712 715 721 732 739 743 
750 753 766 778 780-781 789 803 
814 826 830 837 841 857 869 874 
894-895 925 937 949 954-956 960- 
961 963 968-969 988-989 1000 
1005-1006 1016-1019 1021 1036- 
1037 1052 1086 1090 1109 1113 
1115 1120-1121 1123-1124 1136- 
1137 1140 1144-1147 1151 1167 
1170 1174 1188 1193-1194 1205 
1225 1229 1231 1254 1258 1262 
1280 1285 1309 1312 1334-1335 
1341 1343-1344 1356-1357 1370 
1378-1379 1383-1384 1403-1404 
1423 1429 1434 1442 1448 1451- 
1452 1454 1470-1472 1482 1499 
1525 1528-1529 1532 1536 1547 
1554 1557-1559 1561-1562 1567
1585 1588 1590 1595 1601-1604 
1608 1610-1613 1615 1619 1624 
1627 1640 1644 1647 1660 1664 
1666 1670 1675 1696 1704 1715 
1723 1727 1738 1760-1761 1768 
1779 1785-1786 



cultured 
preadipocytes 



Stratagene



ADP001 



5-8 11 17 25 68-69 80 82 87 103
105 110 116 136-138 168 171 188-
189 196-198 261 267 276 288 293
301 318 331 336-338 379-380 391
400 428 430-431 510-512 520 524
527 549 557 561 602 618 620 622
631 637 647 670 681 682 710 731
748 782 793-794 817 834-836 843
845 858-859 879 882 893-895 934
960 982 986 995-996 1000 1002
1005-1007 1025 1027 1028 1032
1039 1045 1071 1078 1097 1099-
1102 1136-1137 1140 1219-1220



1260 1271 1297-1298 1314 1320
1322 1329 1339 1345 1365-1366
1370-1371 1398 1408 1423 1431
1437 1466 1468 1533 1539 1594
1602 1608 1614 1631 1649-1650
1660 1662 1673 1687-1688 1696
1711 1719-1720 1742 1746 1749
1760-1761 1765 1767 1771 1785

adrenal gland



Clontech 



ADR002 



adult heart 



GIBCO



4-10 15-16 25 29-31 43-45 47 50-
51 55 60 62-63 65-66 75 80 102 
116 118 122 126 130 137 150 169- 
170 181 192 198 201-203 215 227- 
228 247 251 255 267-269 271 280-
281 285 295 298 311 336-338 342
349 351-352 354 372-373 383-385 
391 400 410 415-416 424 426-427 
431 434-437 439 445 454 461 473 
477 483 491 493 497-498 503 516 
519 527 535 546 549 552 572-573 
581 588 595 600 602 608-610 620 
628-630 637 645-646 670 679 703 
713 715 719 732 734 744-746 758 
773-778 789 816 829 837 845 848 
869 875 883 898 904 912 922-923 
930-931 942 948 952 965 967 969 
976-977 981 990 992-993 1001 
1004 1049 1055 1059 1071-1072 
1076 1112-1113 1115 1121 1127 
1134-1135 1151 1158 1163 1175 
1181 1188 1209 1218 1224-1225
1227 1231 1243 1270-1271 1274 
1280 1285 1290 1293 1307 1324- 
1325 1327 1330 1342-1343 1345 
1348 1365-1366 1369 1378-1379 
1387 1398 1400 1405 1417 1425- 
1426 1436 1440-1441 1444 1454 
1463-1464 1488 1491 1507 1512 
1538 1546 1567 1573-1575 1588
1598 1609 1614 1618 1622 1624 
1627 1634 1636 1649 1651 1658 
1671 1674 1678-1679 1691-1692 
1703 1717 1727 1731-1732 1737 
1765 



AHR001 



4-8 10-11 15-16 18-21 34-39 44-
46 50-52 57-58 60 62-63 71 75 82
85 87 89 94 97 100 103 104 108-
110 112 114 116 118-119 122-123
127 130-132 134 136-138 141-144
147-151 153 163-164 168-171 179
186 192 195 197 199 204-205 212-
215 220 225-226 229-230 232 234-
236 251 257-260 262 265 272 274
277 280-282 285-286 289 292 296
298-301 304 307 309 314 321 324-
325 330 333 336 338 345 349 351-
352 354 358 361 368 370 380 383-
384 387-388 391 393 397 401 406
408-409 411-412 414-416 430-431
433-439 445-446 449 452 454-455
457 459 462 469 472-473 476-480
483-484 487-490 492-493 496-498
503 506 508 510-513 516 519-522
526 534 536-540 542 546 549 553
560-562 574-577 581-582 584 586-
587 589 593 595 597 604 609 611-
612 615-620 622-623 626 632 637
645-652 656-660 665-666 670-672
674-675 683-684 687 692-694 697
701 709 712 715-716 719-720 725-



adult kidney 



726 728 730-732 735 738-739 743- 
744 746 751 753 759 761 765 770- 
771 775-780 785 788-790 796 802 
804 810 812 817 821 826 828 830 
837 843 845-847 849-853 857-861
863-864 869 871 875 877-879 881 
883 887 890-892 894-895 897-898 
901 903 906-907 911-913 915 919 
921-925 927-928 933-935 945 958 
961-963 967 969-972 975 977-978 
980-986 990 992 999-1002 1005- 
1007 1010 1016 1019-1020 1022-
1023 1025 1028-1037 1039-1040 
1043 1047 1050 1054-1055 1057 
1059 1063-1064 1067-1068 1070 
1072 1075-1076 1083 1085-1087 
1089 1093-1094 1104 1106 1108-
1109 1113 1116-1117 1119 1121 
1124 1126 1128 1131-1134 1144- 
1145 1148-1149 1151 1158 1167 
1169-1170 1175 1177 1192 1196 
1199-1200 1202 1206-1208 1211
1216 1218 1222 1227-1229 1232- 
1235 1238-1241 1243-1244 1247- 
1248 1250 1253-1254 1256-1258 
1261 1268 1270-1271 1277 1280- 
1282 1287 1292 1298-1299 1306 
1308 1317-1321 1324-1325 1330 
1332 1334-1337 1339 1344-1345 
1349-1350 1354-1356 1359-1360 
1365-1366 1369 1371 1374-1375 
1378-1380 1383-1384 1389 1397 
1400 1403 1409 1417 1423-1426 
1437 1439 1442 1444 1446-1447 
1450 1453 1468 1470 1473 1479 
1481 1488 1490 1501-1504 1519 
1521 1524 1528 1530-1534 1536- 
1537 1539 1541-1542 1547 1553 
1555 1560 1565 1567-1571 1588
1591 1597-1598 1601-1602 1605 
1614-1616 1619-1620 1623-1628 
1630-1632 1634 1636 1641 1644- 
1645 1647 1649 1652-1655 1659 
1662 1667 1673-1674 1680-1681
1684 1686-1688 1704-1705 1709 
1711-1712 1717 1724 1726-1727 
1731-1733 1737-1738 1741 1743- 
1744 1749 1754-1755 1760-1761 
1765 1772 1785 



GIBCO 



AKD001 



-8 10-11 17-21 29-31 35 1 39 42- 
45 50-51 56-58 60-61 64 68-69 75 
77 80 82 85 87 92-94 97 100 102- 
104 107-108 112 116-117 119 123 
127-133 136-137 139-141 143-144 
147-154 157 161-163 165-166 169
172 176 178-179 192 194-197 199 
201 203-206 209-210 212-213 215- 
216 223-228 234-236 238 247 251- 
253 257-259 261-262 265-269 271- 
272 274 276-277 279-281 284-286
290 293 295 298-299 301-302 304 
307 311-313 321 325-326 329-331 
333 341 344 348-350 352 356 358-
359 362 364-365 368 370-372 374
376-377 380-382 392 395 398 400- 
401 404 407-409 414-415 423-424 
430-437 443-444 446 449 451 453- 
455 459 461-462 464 467 469 471- 
474 476-477 480-481 483 487-488 




















490-491 493 497-505 510-513 516-
520 522 524 526-529 534 537-540
544 547 549 554-556 560 562 564
567 571-576 578 582 586-589 592-
593 598-599 601 604-606 608-613
615-619 621-626 632-634 637-643
645-652 655 660-664 669-672 676
678-679 688 692-695 698 702 711
713 717 719-720 727 731 735-736
738 743 745-746 751 753 755 762-
763 765 771-773 775-778 780 786
788 793 795-796 800 803 805 808
810-812 814-819 821 826 829 832
834-838 842-845 848-855 857-861
864-865 867 869 871 874 876-883
886-887 889-891 893-896 898-900
902 906-908 910-914 918 920 922
925-927 929-935 937 940-942 945
948-949 951 953-958 960-961 963-
964 969-970 972 976-978 982-986
988-990 992-993 995-997 999-1002
1004-1008 1010 1012-1013 1016-
1017 1019-1020 1022 1025-1031
1035 1038-1040 1042 1044 1047
1050 1054-1055 1057-1064 1066
1070-1073 1078 1085-1086 1088-
1089 1092 1094 1097 1099-1102
1107 1109-1112 1116-1119 1121
1123-1125 1132-1135 1140 1142-
1143 1146-1147 1149-1150 1153-
1154 1157 1159 1163 1167 1170
1178-1179 1181 1183 1192 1196-
1200 1202-1204 1206-1211 1216-
1219 1221-1222 1225 1227-1230
1232-1234 1238-1241 1243-1244
1246-1247 1253 1257-1258 1260-
1261 1267-1268 1270 1272-1274
1281 1283 1287-1289 1293-1295
1299 1306 1308 1311-1313 1317-
1320 1323 1329-1330 1334-1335
1339 1341 1349-1350 1353-1357
1359 1367 1369 1373 1375 1378-
1379 1394 1397 1400 1403 1405
1407-1409 1417 1419 1423-1424
1428-1431 1433 1437-1438 1442-
1443 1445-1446 1448-1450 1453-
1454 1459 1461 1465-1468 1474-
1475 1478 1484-1488 1490 1492-
1493 1495 1497-1498 1506-1507
1509 1512 1518 1521-1522 1525
1527-1528 1532-1533 1537 1540-
1541 1547-1550 1552 1556-1559
1561 1565-1566 1568 1571 1575
1578-1579 1583 1586-1587 1589
1591-1592 1594 1598 1600 1603-
1604 1606 1608 1611 1613 1615-
1616 1618-1622 1624-1628 1631-
1632 1634-1636 1638-1639 1641
1644 1646-1649 1653-1656 1662
1664 1666-1667 1670-1671 1676-
1679 1683-1684 1686 1691-1692
1696-1699 1701 1709-1711 1713-
1714 1716-1719 1723-1724 1726-
1727 1733 1737-1738 1741 1743-
1744 1748-1749 1751 1760-1761
1763-1768 1778 1780 1785




adult kidney 


Invitrogen 


AKT002 


20-21 37-39 47 52 57 60 65-66
68-69 80 104 107-108 122 130 133
136-137 140 142-143 149 169 174



adult lung 



181 197 227-228 235-236 244 251 
261-265 267 280-281 286 290 299 
301 304-305 309 312-313 339 341 
344-345 349 358 370-372 376 382- 
383 387 392 401 414 416 421 430 
443 445 449 453-454 472 487-488
504 506 513 516 519 522 528 536- 
540 546 554 585 587 594 598 602 
607 616-617 626-627 636 643 662- 
664 695 709 721 735 743 761 768 
775-777 788 796 804 814 827 837- 
838 849-850 852-853 869-870 881 
890-892 898 903 905-907 914 919 
925 927 934 941 949 952 957 960 
962 968 970 1000 1008 1029-1030 
1044 1052 1055 1063 1067-1068 
1073 1085 1099-1102 1107 1110- 
1111 1113 1115 1119 1126 1134 
1136-1137 1146-1148 1153 1159 
1192 1196 1199 1232-1233 1241 
1256 1264 1272-1273 1281 1285
1293-1294 1299 1312 1320 1324- 
1325 1330 1344 1349 1351 1355- 
1356 1369 1378-1379 1403 1414 
1419 1428-1429 1436 1446 1458 
1463-1464 1467-1468 1470 1477- 
1478 1486 1491 1509 1519 1527 
1529 1534 1547 1596 1600 1619 
1623 1629 1631 1634 1638 1643 
1647 1652 1660 1664 1667 1669-
1670 1673 1686 1709 1727 1740 
1776 



GIBCO 



ALG001 



4-8 14 37-39 44-46 50-51 56 62-
63 75 82 88 93 103 104 113 125
133 140 143 150 152 154 157 162
171-172 174-175 190-191 196 200
211 214 219 223-224 227-228 251-
252 256 265 272 274 280-281 285
310 332 345 351 362 371 381-382
394 408-409 431 436 445 454 459
461 467 469 471 476-477 488 504
513 527 537-540 544 547-548 554
564 583 607 616-617 621 623-624
634 645-646 662-664 670 695 716
719 743-744 763 766 774 789 803
811 814 817 831-832 837-838 845
852-853 858-859 861 866 880 887
901 905 941 954-957 966 971 977
979 981 987 990 992 996 1001
1005-1006 1014 1017 1045 1047
1054 1059 1062 1064 1072 1080
1086-1089 1094 1107 1126 1134
1136-1137 1142 1150 1157 1173
1190 1200 1208 1220 1241 1272-
1273 1280 1282 1295 1306 1320
1331-1332 1353 1374 1379 1383-
1384 1404 1409 1423 1434 1436
1442 1474 1478 1494 1509 1522
1525 1531-1532 1547 1549 1553-
1554 1571 1598 1606 1613 1624
1627-1629 1632 1642 1644 1662
1669 1676-1677 1684 1696 1727
1731 1732 1737-1738 1748-1749
1786



lymph node 



Clontech 



ALN001 



4 24 50-51 82 105 137 153 198 
201 223-224 234 268-269 272 280- 
281 287 301 312 329 343 382 421 
430 433 445 451 461-462 475 481- 
482 503 526 529 537-540 546-547 



young liver 



621 626 649 679 719 725-726 738
793 803 831 834-836 838 844 857-
858 866 879 905 913 928 963 976
1005-1006 1012 1038 1050 1116-
1117 1151 1199 1204 1226 1243
1265 1274 1324-1325 1339 1353
1374 1377 1440-1441 1447 1504
1549 1600 1618-1619 1631 1641
1644 1653 1687-1688 1691-1692
1741 1771



GIBCO 



ALV001 



adult liver



Invitrogen



ALV002 



5-8 11 20-21 46 50-51 58 65-66 
75 79 82 93 97 102-103 108 110 
116 139 143-144 148-149 171-172 
174 187-189 194-195 198 209 214- 
215 230 250 258 267-269 280-281 
306 309 342 351 356 359 362 372 
374 392 394 398 401 407-408 410 
414 431 444 455 459 476 478 483 
493 510-512 516 520 522 526 536 
549 571 574-577 585 592 601-602 
607 621-624 628-630 632-633 637
648 660 666-667 678 697-698 700 
717 719 728 730 734 738 744-745 
766 770 773 779 788 800 808 812 
814 841 849-851 871 874 879 887 
893 898-900 902-904 906-907 911 
919 922 924 934 953 957 963 965 
970 984 986 997 1001 1004 1007 
1012 1029-1030 1033-1034 1052 
1061 1066 1070 1076 1086 1089 
1093 1099-1102 1110-1112 1116- 
1117 1119 1121 1125 1136-1137 
1144-1145 1156-1157 1159 1196
1199-1200 1209 1211 1219-1220 
1241 1244 1262 1270 1275 1279 
1283 1295 1317-1320 1332 1339 
1344 1359 1362-1363 1379 1383- 
1384 1403 1415 1430-1431 1437 
1450 1467 1475-1476 1483-1484 
1494-1495 1498 1505 1512 1516 
1518-1519 1526 1529 1547 1550- 
1552 1557-1559 1565 1583 1587 
1597 1609 1614 1620 1631 1637 
1641 1644 1654-1655 1662 1667 
1669 1684 1691-1692 1702 1711 
1725 1738 1741 1743-1744 1758 
1760-1761 1763-1765 1769



5-8 17 20-21 32-33 41 55 58 64 
75 77 86 89 102 108 117 119 175- 
176 198 200 209 231 235-236 250 
272 275-276 284 306 315 321 325
333 356 359 374 376 398 401 408 
414 428 430 433-435 454 476 494 
503-505 517-518 528 534 544 552 
561-563 567 578 581 608-609 630 
632 637 644 650 661 665 672 702 
707 710 721-722 750 753 778 782
794 814 820 826 834-837 847 849- 
850 858 861 874 879 893 898 904 
911 918 921-922 926 946 948 972 
978 986 996 1020 1027 1031 1034 
1053 1063 1068 1070 1073 1086 
1089 1093 1097 1113 1119 1156 
1159 1195 1198-1199 1208 1220 
1227 1241 1261 1272-1273 1277 
1285 1308 1315 1320 1324-1325 
1330 1362-1363 1375 1403 1408- 
1409 1415 1431-1432 1435 1467 
1469 1482 1504 1524 1542 1547 



adult liver 
adult ovary 






1550 1567 1578 1581 1583 1594 

1597 1601-1602 1611-1612 1615 

1618-1619 1621 1625 1637 1645 

1647 1652 1654-1655 1660 16S6 . 

1669-1671 1684 1706 1722 1737- 

1738 1742-1744 1760-1761 1763-
1765 1772 1774 



Clontech 



ALV003 



29 676 997 1063 1119 1536 1766 



Invitrogen



AOV001 



1 4-18 20-23 29 35-40 42-48 50- 
51 53-58 61-63 65-66 68-69 73-75 
77-78 80 82 85 87 89 97 100-101 
103-104 106-108 110 113 115 118 
122-124 126 128 133-134 136-140 
142 145-147 149-157 161 166 168- 
170 174 177-178 180 182-186 188- 
189 192-203 207 209 211-215 219 
221-224 229-230 234 242-243 246- 
247 255 258 260-262 265-269 271- 
272 274 277-281 284-286 288 290 
295 299 301-302 304 307 309-311 
313-314 316 321 323-326 330 332- 
333 335-338 341 344 349 352-353 
356 358 360 362 370-372 376-377 
379-384 387 390-392 394 397-398 
400 403 408-410 412 414-416 423- 
424 426-427 430-435 439 443-446 
448-449 451 453-455 462-463 468- 
471 473 476-479 481-484 487 489- 
494 496-497 499-501 503-505 509- 
514 516-517 519-520 522 524 526 
528-534 541-544 546-547 549 552
554-555 561-564 566-567 569-570
572-573 575-576 579 581 583 585-
588 590-591 593 595 597 599 601- 
605 607-613 615 618-622 624-627 
630 632-633 636-640 642 644-647 
649-652 654-655 657-665 667-675 
677-678 681 683-684 692-695 697- 
710 714-721 723 725-727 729 732 
734-735 743-746 750-751 753 758 
763 765 767 772-773 775-778 780 
783-784 786 788 790-791 794-796
800 803 805 809-811 813-815 818- 
819 821-824 826 828-829 831-832 
837-838 843-850 852-857 859-864 
867 869 871-872 874-875 878-883
887-888 890-895 898-910 912-914
916 919-922 924 926-927 929-939 
941 943-946 948-951 953 955-958 
961-964 966-967 970-979 981-982 
985-986 988-990 992 995-997 999- 
1001 1004-1009 1011-1013 1016 
1019-1020 1024-1025 1029-1031 
1033-1035 1037 1039 1041-1047 
1050-1051 1054-1060 1062-1064 
1067-1070 1072-1073 1075-1076 
1078-1079 1085-1086 1089-1090 
1094-1096 1098-1103 1106-1108
1112-1117 1119-1120 1123-1127 
1131-1135 1142-1143 1146-1149 
1153 1156 1158 1163 1165-1166 
1169-1171 1173-1175 1177-1178 
1180 1183-1185 1190-1191 1195
1197-1200 1202 1205-1214 1217- 
1219 1221-1226 1232-1235 1238- 
1241 1243-1244 1247 1249 1252- 
1254 1256-1258 1262 1265 1267- 
1268 1270 1275 1278 1280-1283
1286-1289 1291 1293-1294 1298- 



1299 1306 1308 1312 1317-1321
1323 1327 1329-1330 1332-1333
1338-1339 1341 1343 1351 1356
1359 1361 1365-1366 1371-1375
1377-1375 1383-1384 1386 1389
1394 1400 1404 1416-1417 1422-
1427 1429-1431 1435-1436 1439-
1443 1445-1450 1453-1454 1459
1463-1464 1466 1468 1470 1474-
1481 1484-1485 1488 1491 1493-
1494 1496-1498 1501-1504 1506-
1507 1511-1517 1519 1521-1524
1526-1527 1530-1531 1534-1536
1538-1539 1541 1546 1548-1550
1553 1555-1559 1561-1563 1566-
1567 1569-1570 1572 1574-1575
1578 1580-1581 1587-1588 1590-
1591 1595 1597-1598 1600-1606
1609 1611-1621 1623-1630 1634
1636 1638 1641 1643 1645 1647-
1657 1659-1662 1664 1667 1669-
1671 1673-1674 1676-1681 1683-
1690 1699 1702-1707 1710-1711
1713-1714 1715-1719 1723-1724
1726-1728 1731-1733 1735 1737-
1738 1740-1741 1743-1744 1748-
1751 1753 1755-1756 1760-1762
1765 1767-1768 1770-1771 1776
1778-1779 1783-1784 1786



adult placenta 



Clontech



5-8 44-45 90-91 107-108 159 178 
311 351 414 476 503 545 574 624 
636 719 755 773 860 890-891 924 
947 955-956 962 990 992 1002 
1045 1202 1320 1369 1628 1686 
1713-1714 1743-1744 



APL001 



placenta 



Invitrogen 



APL002 



14-16 26 29 43 60-61 79-80 103 
106 116 135 171 177 180 194 196 
198 210 216 235-236 272 290 299 
309 329 334 339 359 379-380 417 
423 430 434-435 448 454 483 490- 
491 517 522 631 723 725-726 728 
738 746 769 818 843 854-855 857- 
858 916 948 953-954 976 988-989 
1005-1006 1013 1033 1036 1064 
1068 1070 1086 1139 1144-1145 
1160 1277 1285 1317-1320 1343 
1345 1429 1435 1438 1454 1482 
1486 1490 1512 1519 1532 1549 
1592-1593 1602 1626 1647 1649 
1664 1673 1675 1722 1727 1730 
1746 1776 



adult spleen



GIBCO 



ASP001 



3 5-8 12 15-16 19-21 24 29 34-36
44-45 57 60 82-83 87 89 94 98-99
103 106 108 117 119-121 139 141
147 152-153 155 166 169 171 174
178-180 196 198 201-206 209-211
215 219 234 253-254 256 258 264
272 280-281 290 295 302 309 312
325 333 341 349 358 372 382 386-
387 394 406 414 431 434 436 446
448 451 473 481 490-493 500 503
505 517 519 530 534 536-540 547
554 557 574 576 582 592 595 604
611-612 620-621 623 631-632 642
652 659 661 667 671 673-675 684
700 721 728 730 732 738 742-744
746 762 765 774 780 788-789 794
810-811 817 822 830 832 845 848
852-853 858 862 866 874 879 882



884 906-908 912 919 921-923 926-
927 934 942 949 957-958 963 977-
978 983 990 992-994 996-997 999
1005-1007 1010 1012 1031 1036
1042-1044 1046 1049 1059 1068
1070 1076 1085-1090 1094 1103
1109 1113 1115 1124 1140 1163
1170 1174 1177 1190 1196 1219-
1220 1226-1227 1229 1236 1241
1246 1258 1269 1271 1274 1295
1301 1320 1322 1330 1334-1335
1339 1349 1351 1353 1359-1360
1364 1369 1374 1386 1397 1413
1417 1434 1436-1437 1439 1468
1474 1477 1480 1485-1487 1498
1512 1522 1525 1544-1549 1553
1560 1567 1591 1600 1631 1636
1651 1654-1655 1658 1662 1670
1674 1678-1679 1684 1686 1700
1727 1733 1738 1740-1741 1760-
1761 1774 1779 1781-1782





testis 



GIBCO



ATS001 



5-8 10 26 30-31 47 50-51 57 68-
69 82 84-85 97 102 113 119 137
139 150 152 154 156 163 169 174
176-177 192 194 196-197 212-215
227-228 247 255 258 261 282 285
288-289 301 307 311 316 330 334
349 370-372 392 398 410 415 426-
427 430-431 433 437 446 454 461
469 473 477 481-482 493 499 502-
503 513 522 526 547 552-553 563-
564 572-573 575-576 581-582 585
599-602 605 612 615-617 620 631
637 647 649-650 656 660 665 670
674-675 712 719-721 723 728 731
738 744 746 773 780 784 788-789
802 804 809 811 814 826 831 837
843 845 848 859 866 869 877 905
913 916 919 921 926 929 937 950
960 963 971 975 977 981 990 992-
993 1007 1016 1029 1030 1034-
1035 1038-1039 1045 1059-1060
1064 1070 1072-1073 1087 1089
1097 1099-1102 1104 1108 1113
1141 1149 1161-1162 1175 1208-
1209 1222 1227 1229 1231 1235
1238-1239 1243 1253 1285 1287-
1289 1291-1293 1307 1311 1317-
1320 1330 1332 1338 1345 1369
1373-1374 1379 1389 1399-1400
1409 1423-1424 1430 1435-1437
1443 1459 1484 1486 1490 1493
1496-1497 1501 1505 1509-1513
1527 1530-1531 1533 1537 1546
1549 1563 1565 1567 1569 1571
1577 1586 1591 1599 1602 1625
1628 1630-1632 1636 1639 1642
1649 1661-1662 1666 1667 1670
1675 1684 1690 1699 1705 1712
1717 1724 1730 1737-1738 1752
1767 1779

686 1352 1412



Genomic DNA 
from BAC 63118 



Research 
Genetics 
(CITB BAC 
Library) 
Research 
Genetics 
(CITB BAC 
Library) 



BAC001 



Genomic DNA 
from BAC 39316 



BAC002 



1411-1412 



Genomic DNA 
from BAC 39316 



adult bladder 



Research 
Genetics 

(CITB BAC
Library) 



BAC003 



1352 



Invitrogen



BLD001 



bone marrow 



5-8 17-18 22-23 33 37-39 56-57
80 93 100 120-121 169 201 237
251-252 272 278 311 348 353 382
413 415 424 430 443 483 502 542-
543 562 564 607 616-617 626 635
652 667 671 710 727 755-756 762 
773 786 783 837 840 866 893 898 
909 918 929 966 977 983 1016 
1025 1055 1073 1082 1140 1167 
1185 1189 1199 1270 1369 1481 
1536 1560 1573 1596 1614 1636- 
1637 1649-1650 1654-1655 1658 
1669 1671 1690 1719 1727 1731- 
1732 1739 1741 1760-1761 1779 



Clontech 



BMD001 



3-8 11 13 18 29-31 33 35-36 40 
43-45 47-48 50-51 57 60 65-66 75 
80 82 85 88-89 94 100 103 107 
110 115 118-119 124-125 133-134 
136-137 139-141 146 150 152-153 
155 161 163 168-170 172 178-180
187 192-193 197-198 203-205 210- 
213 215 217 219 222 224-226 233 
235-237 242-244 255 258 260 263-
264 266 273 276 278 283 286 290 
295 301-302 307 312-313 321 330 
333 339 343 352 357-358 370-371 
382 384-385 387 389 394 408 410 
412 416 421 424-427 429-431 436- 
437 439 441-442 445 447 454-456 
461-462 471-472 475 477-479 481- 
482 485 488 493 498 500 503-506 
513 516 519 523-524 526 530 535- 
540 542 544-545 549 555 565 567 
569-577 581 583-586 588 593 601 
603-604 608-609 613-619 621-622 
632-633 636-637 642 649-650 656- 
660 666 670 672 674-675 679 683 
701 708 716 718-720 731 735-736 
740-742 744-745 752 761 765 772- 
773 775-778 780 785-786 789-791 
796 798 802 810-812 823-824 826 
830 832-833 837-838 843-844 848- 
855 858-859 866-867 869 878-880
883 890-892 896 903 905 908 912-
914 922-924 927 930-931 937 939-
941 952-953 955-959 963 969 973
976 981 985 987 990 992 995 1000
1002 1005-1007 1013 1016 1025 
1028-1031 1033 1035 1037 1039 
1042 1044 1047 1050 1053-1054 
1059 1061 1063 1066 1070-1071 
1079 1106 1110-1113 1115-1117 
1124 1126 1134-1135 1142 1144-
1145 1163 1172 1178 1197 1199- 
1200 1202 1216-1217 1224 1227- 
1228 1240 1246 1254 1261 1266 
1270 1278 1281 1285 1287 1290-
1291 1293 1299-1301 1308 1314 
1317-1320 1327 1331 1339 1343 
1346 1349 1353 1356 1361 1367 
1369 1372-1374 1379-1380 1394 
1400 1403 1406 1408 1413 1417 
1419 1423 1425-1427 1430-1431 
1433 1439 1443 1446-1449 1459 
1463-1464 1482 1486 1493-1494 



1506 1509 1513 1521-1522 1524
1526 1528 1531 1536-1537 1543
1546 1548-1549 1552 1554-1555
1557-1559 1571-1572 1581 1589-
1592 1597-1600 1609 1614 1621
1626-1628 1630-1632 1634 1636
1638-1639 1641 1646-1647 1651
1653-1655 1661-1662 1676-1681
1684 1686 1690 1702 1707 1711
1713-1714 1717 1720 1722-1723
1727 1737-1738 1740 1758 1767
1772 1781-1782 1785-1786





bone marrow 



Clontech 



BMD002 



11 15-16 19 30-31 35-36 68-69 75
83-84 93 99 103 108-109 118 137
139 169-170 174 177 180 190 193
212-213 219 222 225-226 232 237
255 259 264 273-274 284 286 290-
292 295 301 303-304 307 312-313
316 324 326 330 334-335 348 352-
353 357 360 370-373 384 386-387
397 403-404 414-416 421 425-427
429-430 433-436 440 444 451 454
465-466 472 475 478 491 493 516 520 523
525 531 545 548 552 566 569-570 581 583
590-591 597-598 601 616-617 621
641 650 652 656
659 671 674-675 679 684 710 718- 
719 728 734 737-738 742 761 765 
774-778 790 811 814 818 830 834- 
836 854-855 859 866 869 871 878- 
879 884 889 892 904 922-923 932 
990 992 998 1001 1004 1016 1036 
1042 1048 1051 1054-1055 1058 
1088-1089 1106 1112-1114 1155 
1157 1192 1200 1223 1227-1228 
1236-1237 1260-1261 1282-1283 
1285 1287 1295 1314 1317-1321 
1324-1327 1330 1333 1341 1343 
1347 1350 1353 1355-1357 1367 
1369-1370 1373 1377 1379 1381 
1383-1384 1394 1397 1400 1406 
1413 1417 1425-1427 1438 1442 
1446 1459-1460 1470 1493 1505 
1521 1536 1546-1549 1560 1573- 
1574 1578 1598-1600 1621 1626
1631 1634 1646 1649 1653 1656 
1658 1669-1670 1683-1684 1687- 
1688 1690-1693 1696 1699 1702 
1704 1707-1709 1711 1720 1722- 
1723 1725 1727 1729 1731-1733 
1738-1740 1743-1746 1752 1755 
1760-1761 1767 1777 1781-1782 
1786 



bone marrow 
bone marrow 



Clontech 
Clontech 



BMD004 



73-74 503 922 1036 1711 



BMD007 



95-96 866 1320 1475 



adult colon 



Invitrogen 



CLN001 



17 56-58 103 110 117 144 150 171 
179 185 188-189 201 204-206 210 
218-221 225-226 231 237 251 277
288 310 312 320 333 359 386 388 
394 408 420 455 401 485 503 510- 
512 590-591 615 635 647-648 665 
672 684 697 710 725-726 743 780 
786 788 826-827 848-850 854-855 
858 866 872 898 918 921-923 953 
976 983 993 1005-1006 1017 1020 
1025 1027 1054-1055 1063 1068- 
1069 1140 1153 1170 1185 1196 
1199 1220 1280 1314-1315 1320 
1345 1351 1355 1369 1428 1439 



Mixture of 16 
tissues - 
mRNAs 



Mixture of 16
tissues - 
mRNAs* 
adult cervix 






Various 
Vendors 






CTL016 






1462-1464 1512 1556 1583 1587

1594 1596 1614 1625-1626 1631 

1639 1645 1650 1675-1677 1687- 

1688 1701 1713-1714 1724 1740 
1765 



401 1490 1686 



Various 
Vendors 



CTL621



BioChain



312 782 1132-1133 1403 1712 1715



CVX001 



1 4-8 11 13 18-21 25 26 30-31 33
37-39 43 46-47 58 61 64-66 71
73-74 82 85 94 100 103-104 113
118 122 126 130 134 140 147 153-
156 163 170 179 181 186 192 195-
196 198 201-202 218-219 222 229-
231 257 266 276-277 285-286 288
298 301-302 304 307 312-314 324
326 329-330 332 335 342 352 358
362 371-372 376 379 381-382 384
388 398 400 410 414 416 419-420
426-427 430-431 433-436 439 446
448 461-462 464 471-477 479 482-
483 491 493 496 503 506 510-513
516-517 526 530 535 542-544 546-
547 557 561 572-573 575-577 581-
582 585-586 588-589 593-594 600
602 604-605 607-609 612 615-619
623 644 650 654 657-658 662-665
670 672 680 683 691-694 698 706
708-709 711 713 720-721 727 729
731-732 737 745-747 753-754 760
765 771 774-777 780 790 793 796
798 800 803 805 818 826 828 831-
832 834-836 843 847-848 851-855
857-860 864-866 869 871 876 878-
880 882 887 890-891 897 899-902
905-908 912-913 916 918-919 922
927 932 934-938 944 948 955-956
958 963-964 967 969-970 972 976
978-979 983 985 990 992 1000
1005-1007 1016-1017 1024 1027
1033 1036 1038 1045 1047 1053-
1056 1066-1067 1071 1073 1075
1079 1082 1098 1113 1124 1129
1134 1139 1146-1149 1163 1167
1170 1173 1175 1177 1181 1197
1200 1202 1211 1214 1216 1221-
1222 1225 1227 1232-1234 1240-
1241 1243 1258 1264-1265 1268
1270 1279 1287-1290 1308 1310-
1311 1316 1320 1323 1327 1345
1349 1353-1354 1360 1372-1374
1383-1384 1386 1394 1397 1405-



The 16 tissue mRNAs and their vendor sources are as follows: 1) normal adult brain 
mRNA (Invitrogen), 2) normal adult kidney mRNA (Invitrogen), 3) normal adult liver 
mRNA (Invitrogen), 4) normal fetal brain mRNA (Invitrogen), 5) normal fetal kidney 
mRNA (Invitrogen), 6) normal fetal liver mRNA (Invitrogen), 7) normal fetal skin mRNA 
(Invitrogen), 8) human adrenal gland mRNA (Clontech), 9) human bone marrow mRNA 
(Clontech), 10) human leukemia lymphoblastic mRNA (Clontech), 11) human thymus 
mRNA (Clontech), 12) human lymph node mRNA (Clontech), 13) human spinal cord 
mRNA (Clontech), 14) human thyroid mRNA (Clontech), 15) human esophagus mRNA 
(BioChain), 16) human conceptional umbilical cord mRNA (BioChain). 



diaphragm 



BioChain 



DIA002 



endothelial 
cells 



1406 1416 1425-1427 1431 1436- 
1437 1442 1446 1448 1453 1459 
1466 1472 1478 1482 1456 1501- 
1503 1506 1512 1522 1527-1528 
1531 1533 1541 1547 1569 1571 
1585 1589 1597-1598 1600 1608- 
1609 1614-1616 1620 1623-1624 
1626-1628 1630 1638 1641 1643 
1649 1653 1656 1662 1667 1669 
1674-1675 1683 1685-1688 1699 
1702 1709-1710 1715 1717 1722 
1724 1729 1731-1732 1735-1739 
1741 1743-1744 1748-1749 1755 
1760-1762 1767 1773 1778 1785- 
1786 



137 282 289 730 780 
1478 1599 16X4 



986 1409 



Stratagene 



EDI 0 0 1 T 3 5-10 13 15-21 24-26 29 34 37- 
39 42 44-45 50-51 53-55 57-58 
60-61 65-66 68-69 73-74 77-78 80 
82-83 85 87 89 93-96 101-105 108 
110 112-114 116 118-122 124 128 
133-134 137-142 147-150 152-153 
161-163 166-172 176-179 187 190 
192 194 196-201 204-207 210 212- 
214 220 224 229-230 233 235-236 
240-241 251-252 258 261-262 265 
267-269 272 276-277 279-281 284- 
285 288 290 295-296 301-302 310- 
311 313 316 321 325 329 331-333 
335 340-342 351-355 360 371 375 
380-382 384 387 390 392 397 400 
407-408 410 412 414 416 425-427 
431 434-436 439 444-445 449 454 
463-464 472-475 477-479 486 488- 
490 497-498 500-504 510-513 516- 
519 522 524 526-528 532-534 536- 
540 542-546 548 561-563 566-567 
572-576 579 581 585-586 589 593 
595 597 599 603 607-612 615-617 
620 622 626 630 632-634 638-641 
644 647 656-660 662-664 670 673 
678 680-682 692-697 707 709-710 
712-713 719 730 732 734 736 738 
743-746 751 759 768 771 773 775- 
778- 783 786-789 793 800 803 805- 
807 810-811 814 816-818 821-822 
824 826 828-829 832 834-838 842- 
845 848-850 854-860 862 864 869 
871 874 876-879 883 885 887 890- 
891 894-895 898-900 903 908 910- 
913 916 919-922 924 926-928 930- 
935 939 943 948-949 951-954 957 
959-961 964 969-970 973 975-978 
983-984 988-990 992-993 996-997 
1000 1002 1004-1013 1016-1020 
1022-1025 1028 1031 1033-1034 
1038-1046 1050 1055-1056 1059- 
1060 1062-1064 1067-1070 1072- 
1074 1076 1078 1082 1086-1087 
1089-1090 1093-1097 1099-1103 
1107 1109-1113 1116-1117 1124- 
1126 1128-1131 1134-1135 1138 
1140 1144-1145 1148-1149 1153 
1157 1160 1163 1171 1183-1184 
1198-1199 1202 1205-1207 1211 
1216-1217 1219 1221 1225 1229 
1232-1235 1238-1241 1243-1244 
1246 1250 1253 1257-1258 1261 



1265-1266 1268 1270-1271 1274-1277 1280-1283 
1285-1286 1288-1290 1293 1295 1298 1308 1312 
1317-1320 1324-1325 1327 1329-1330 1334-1335 
1338 1342-1343 1345-1347 1350 1355-1356 1359 
1367 1369 1374 1376 1379 1398 1400 1406 1408 
1414 1417 1419 1424-1426 1428-1431 1434-1438 
1440-1442 1448 1450 1462-1466 1468 1472 1474 
1478 1487-148Q 1491-1493 1501-1504 1506 1509 
1511 1516 1520-1521 1526 1531 1536-1537 
1539-1540 1546-1547 1549 1552 1555 1557-1559 
1561-1565 1568 1571 1575 1578-1579 1581-1583 
1587-1588 1590 1592 1597 1605-1606 1611 1613 
1615 1618-1621 1624-1628 1630-1631 1634 1636 
1638 1641 1643-1650 1652-1659 1664 1666-1667 
1669 1671 1675-1681 1683-1688 1696-1698 1703 
1711 1715-1716 1719 1722-1723 1726 1731-1733 
1736 1739-1741 1743-1744 1749 1755 1760-1761 
1765 1767-1768 1771-1773 1776 1779 1783-1786 



Genomic clones 
from the short 
arm of 
chromosome 8 



Genomic DNA 
from 
Genetic 
Research 



EPM001 



286 686 1297 1303-1304 1352 
1411-1412 1754 



131-132 261 289 380 503 660 892 
1000 1007 1397 



esophagus 



BioChain 



ESO002 



fetal brain 



Clontech 



FBR001 



62-63 89 112 126 194 322 336-338 
379 391 411 481 546 563 607 679 
710 867 1012 1031 1055 1251 1262 
1320 1407 1643 1652 1686 1731- 
1732 1746 1765 



fetal brain 



Clontech 



FBR004 



68-69 90-91 139 212-213 301 331 
362 374 403 436 611 645-646 659 
668 670 691 785 805 845 1163 
1209 1216 1232-1233 1238-1239 
1387 1410 1416 1430 1496 1536 

1547 1593 

5-9 25 43 60 62-63 65-66 70 72 
80 87 92 101 103 108 114 136 139 
149 152-153 157 168 171-172 175 
207-208 210 212-213 221-226 237- 
238 251-253 266 272 279-281 295 
301-302 307 310 317-318 321-324 
330 333-334 336-338 346-347 352 
357 370 373 377 379-380 382 384 
391-392 397 399 402 406-408 410- 
411 417 421 424 426-427 430 436- 
437 440-443 454 460 464 467 473 
476 483 488-489 495 497 508 510- 
513 516 519-520 524 530 537-540 
544 547 550 561 567 572-574 582 
590-591 595 597 604 607-609 615 
623 628-629 631 634 638-640 655 
657-658 660 665 669 674-675 679 
689 691-694 696-697 699 701 706 
710 716 720 728 732 734 736 742- 
744 757-760 763 775-778 780 799 
806-807 810 817-818 826 839 843 
858 861 864 871-872 884 890-891 
894-895 898 904 915 921-923 935- 
936 938 945 950 952 955-956 958- 
959 961 963 967 969-971 990 992 



fetal brain 



Clontech 



FBR006 



999 1001 1005-1006 1008 1013 
1016 1022 1024 1029-1030 1032 
1035 1042 1047-1048 1052 1056 
1065 1067 1070 1082 1089 1109 
1114-1115 1119 1131 1143-1149 
1151 1153-1156 1160 1163 1167 



1172-1173 1178 1184 1186 1188 
1190-1200 1211 1216 1222-1223 



1226-1227 1229 1231 1236 1245 

1253-1255 1258 1260 1262 1266 

1270-1273 1281 1287 1308-1309 

1314 1317-1320 1326 1334-1335 

1339 1341 1344 1350 1356 1369- 

1371 1373 1376 1379 1381-1382 

1386 1392 1396-1398 1419 1423 

1425-1426 1428-1429 1432 1437 

1440-1441 1448 1466 1470 1482 

1502-1503 1507 1511 1513 1516 
1519 1536 1544 1549-1550 1557- 

1559 1573 1589-1590 1598 1608 

1611-1614 1619 1621 1625-1626 

1640 1651 1657-1658 1676-1679 

1693 1696 1703-1704 1713-1714 

1718 1720 1722 1724 1726 1728 

1730-1733 1735-1736 1738-1739 
1742 1745 1755 1759-1761 1765 

1767 1771-1772 1777 1779-1780 
1786 



1188 1587 " 



fetal brain 



Clontech 



FBRS03 



235-236 520 864 106T 
15-18 20-21 24-25 29 34 43 61-63 
77-78 98 101 103 107-108 128 130 
136 146 148 165-166 171 174 181 
185 196-198 204-205 208 223 230 
235-236 251 253 261 268-269 280- 
281 284-285 288 309 311 321 329 
334 339 346-347 350 357-359 381- 
383 390 407 418-419 430 434-435 
438 443-444 461 464 466 483 490 
494 509 516 519 522 527 557 561- 
562 572-573 590-591 595 597 623 
632 647-648 650 655 669-670 672 
682 690-691 700-701 710 717 736 
746 782 784 788-789 814-815 825 
829 840-841 847 854 855 857-858 
897-900 904 919 925 935-937 946 
948-949 954 960-962 966 969-970 
986 996 1000-1001 1005-1007 1012 
1014 1022-1028 1045 1052 1055 
1068 1070 1072 1078 1082 1085 
1090 1109 1115 1118 1120*1128 
1136-1137 1144-1145 1149 1156- 
1157 1193-1195 1198 1204-1205 
1220 1222 1234 1257 1262 1271 
1274-1275 1280 1285 1286 1294 
1312 1314 1317-1320 1330 1342 
1344-1345 1349-1350 1355-1356 
1358 1364 1369 1379 1383-1384 
1431 1435 1476 1507 1519 1532 
1536 1547 1554 1564 1567 1578 
1582 1587 1593 1595 1601 1608 
1615 1619-1621 1638 1644 1661 
1665-1666 1673 1687-1688 1690 
1715 1723 1728 1749 1753 1757 
1759-1761 1765 1771 1774 1776 
1778 1781-1782 1786 



fetal brain 



Invitrogen 



FBT002 



105 124 150 289 864 1036 1148 
1229 1614 1616 1762 1785 



fetal heart 



Invitrogen 



FHR001 
FKD001 



fetal kidney 



Clontech 



5-8 11 40 47 57 65-66 82 85 102 
124 163 171 216 222 224 235-236 



fetal kidney 



Clontech 



FKD002 



fetal kidney 



258 277 280-281 307 310 314 330 
371 387 392 395 403 422-423 431 
436 443 455 469 500 519 522 542 
563 572-573 585 600 619 623 650 
654 657-658 660 679 719 731 780 
798 821 833 844 854-855 857 864 
868 878 911 929 958 960 969 990 
992 1007 1046 1087 1103 1129 
1139 1285 1312 1331 1355 1369 
1371 1376 1391 1422 1425-1426 
1440-1441 1470 1543 1598 1601 
1618 1631 1651 1654-1655 1669 
1678-1679 1691-1692 1733 1785 
352 384 426-427 440 583 602 1060 
1131 1324-1325 1636 



Invitrogen 



FKD007 



fetal lung 



20-21 82 163 335 679 988-989 
1000 1227 1230 1320 1554 



Clontech 



FLG001 



fetal lung 



35-36 94 323 371 393 426-427 445 
473 549 560 604 616-617 626 631 
649 651 719 746 786-787 832 842 
849-850 864 894-895 1075 1178 
1182 1200 1206 1309 1311 1345 
1429 1493 1567 1576 1620 1686 



Invitrogen 



FLG003 



fetal lung 



fetal liver - 
spleen 



Clontech 



Columbia 
University 



9 15-16 29 41 47 68-69 83 88-89 
102 124 137 152-153 165 196 224 
229 231 249 254 256 267 291-292 
300 325 333 344-345 352 373 376 
379 384 408 426-427 430 432 467- 
468 475 483 488 493 516 531 535 
545 547 549 564 582 602 623 644 
660 662-664 670 673 725-726 728 
761 766-767 774 805 830 852-853 
864 875 921 932 937 946 949 963 
988-989 1014 1016-1017 1024 1027 
1090 1097 1170 1185 1200 1215- 
1216 1224 1258 1290 1309 1320 
1342 1347 1355 1369 1381 1413- 
1414 1431 1438 1449 1491 1512 
1536 1547 1557-1560 1567 1590 
1601 1636 1644 1653-1655 1662 
1667 1671 1675 1680-1681 1706 
1739 1760-1761 1769 



FLG004 



103 276 334 
1614 1658 



465-466 737 843 1131 



FLS001 



3-11 13 15 
51 54 56-58 
77-80 82-83 
110 112 116 
135-139 141 
157 163-165 
180 186 188 
200 202-206 
233-236 240 
255-256 258 
274 276-278 
293 295 299 
311 314 316 
332 342 
358 360 
386-387 390 
406 408 410 
37 439-442 
56 459 461 
87-488 490 
506 509-513 
529 531 534 
53-554 561- 
76 579 581 



344 
362 



21 25 30 
60-66 6 
85 87 8 
-124 126 
144 147 
167-172 
-190 193 
210-214 
-244 246 
261-265 
280-281 
-301 304 
318 320 
-345 350 
370-374 
392-393 
•412 415 
444-445 
470 472- 
491 493 
515-520 
536-540 
562 564 
583 585- 



-39 41-4 
8-69 72 
9 92-103 
-127 130 
-149 152 
174 176 
-194 196 
219 221 
-247 250 
268-269 
284-286 
306-307 
-321 326 
352-353 
376 378- 
400-401 
417 419 
448 452- 
479 481- 
500-501 
522-524 
542 547- 
567-568 
597 599- 



8 50- 
75 
105- 
133 
-153 
-178 

198- 
-231 
-251 
272 
288 
309 
329- 
356- 
•384 
403 
422- 
454 
483 
503- 
526- 
54 9 
571- 
605 



607 610-613 615-621 623-624 626 
628-634 636-640 644 647-650 655- 
660 665 669-670 672 674-675 678 
681-682 684 690-695 697 702 708- 
710 713-714 716-719 725-728 730- 
731 734 736 738 740-741 743-746 
748 750-751 759-766 768 772 774- 
777 779 783-788 793 796 798 800- 
805 808 810-812 814 818-819 821- 
824 826-832 834-837 843-847 849- 
867 869-876 878-883 887 889-895 
897-898 902 904-914 916 919 921- 
928 930-937 939 945-950 953-958 
960-961 963-965 967 969 971 974- 
978 980-983 986 988-990 992-993 
995-997 1000-1002 1004-1008 1012 
1014 1016-1019 1025-1026 1028- 
1031 1033 1035-1036 1039-1044 
1047 1049-1050 1053-1056 1058- 
1059 1061-1064 1067-1070 1072- 
1074 1076 1078 1082 1085-1087 
1089-1090 1097 1099-1103 1107- 
1113 1115-1119 1121-1123 1125 
1127-1128 1131-1134 1136-1137 
1144-1150 1153 1159-1160 1163 
1170 1175 1177-1178 1188 1190- 
1192 1195-1200 1202 1206 1208- 
1211 1214 1216 1218 1221-1222 
1225 1227 1234 1237 1241 1244 
1246-1247 1251 1254 1258 1261 
1266 1268 1270-1273 1277-1282 
1284-1285 1287-1290 1294 1299- 
1300 1306-1308 1313-1320 1324- 
1325 1327 1330 1332-1333 1338 
1341 1343 1345-1347 1349-1350 
1353-1360 1362-1363 1365-1367 
1369-1370 1372-1374 1376 1378- 
1381 1383-1384 1386 1389-1391 
1400 1402-1403 1405-1410 1413 
1415 1417-1419 1422-1429 1431 
1435-1437 1439-1442 1445-1446 
1448-1449 1454 1458-1459 1466- 
1470 1472 1474 1477-1478 1480 
1482 1485 1491-1493 1496-1498 
1501-1507 1509 1511-1512 1516- 
1519 1524-1526 1529 1532 1536- 
1541 1546-1547 1549-1550 1552- 
1554 1562 1564 1569 1572 1574- 
1575 1578 1581 1583 1587-1588 
1591-1592 1594-1595 1597-1598 
1600-1604 1611-1612 1614-1615 
1617-1618 1620-1622 1624-1625 
1627-1628 1630-1632 1634-1639 
1645-1651 1653-1662 1664 1667- 
1669 1671 1673-1674 1676-1688 
1690 1696 1701-1703 1706-1709 
1711 1713-1714 1718-1719 1722 
1724-1727 1731-1733 1738 1740- 
1741 1743-1744 1746 1748 1751- 
1752 1754 1760-1765 1767-1773 
1780 1783-1786 



fetal liver- 
spleen 



Columbia 
University 



FLS002 



3-11 13 15-21 26 29 32 35-39 42 
44-45 48 50-51 54-55 57-58 61 64 
68-69 73-75 78 80 82 84 87 95-98 
100 103 105 107-108 110 112-113 
116-119 122-125 128 130 137-138 
145 147-153 155 157 159 161-163 
166 168 171-172 174-175 177 181 
188-189 193-194 196-198 200-203 



206 212-215 219-221 223 225-229 
231-232 240-244 246-247 250-251 
258-259 262 264 268-269 272 275 
277 280-281 284 286 288 290-292 
295 298-299 301-304 306 308-310 
318 320-321 323 325 329 331 334 
342 348-349 352-353 356 359 368 
371 374 376-379 381-384 386-387 
392-393 397-398 400-401 403 410- 
413 421 423 426-427 429-430 433- 
436 438 440 443 445 448 451-452 
454-455 460-463 465-467 469 471- 
473 475-476 478-479 481-483 487 
490-491 493-494 497 500-501 SOS- 
SOS 509-513 515-517 519-520 524 
526-531 534 537-542 544 547 552- 
554 556 558 561-562 564-567 571- 
577 583-587 590-591 593 595 597 
601 604-606 608-613 616-617 619- 
624 626-632 634 637-642 644 647 
649-652 654-659 662-665 669-672 
674-675 681-682 685 688 690 696 
698 700-703 707 709-710 713 717 
719-721 723-724 728 731-732 734 
737-738 742-745 748 752 754 759 
763-766 768 770 773-777 780 782 
784 786 791 795-798 801-802 805 
808 811-812 818 823-824 826-827 
832 834-837 839 843 846 848-856 
858-861 865 867 869 871 873-874 
876 878 881-882 887 889 892 894- 
898 901-902 904 906-908 913-915 
919 921-924 926-932 934-935 937 
939-941 943 946-947 950 953 958 
961 965-967 971 973-975 977-979 
981 984-985 990 992-993 995-997 
999 1001 1004-1007 1009-1011 
1013 1016 1020 1023 1025 1027- 
1031 1033-1035 1039-1042 1044- 
1045 1049 1053 1055-1056 1058- 
1059 1062 1064-1065 1067-1070 
1072-1074 1079 1082 1087 1089 
1093 1097 1099-1103 1105-1107 
1109-1114 1123 1125-1127 1132- 
1134 1140 1143-1145 1140-1150 
1156 1158 1160 1163 1172-1173 
1177-1178 1181-1184 1190-1192 
1195-1197 1199 1204 1206 1208 
1211 1214 1216 1219 1227 1230 
1234-1235 1237 1240-1241 1243 
1245 1247 1256 1258 1260-1261 
1264 1268 1270-1271 1275 1278- 
1279 1284-1286 1288-1289 1299- 
1301 1306 1308 1312 1314 1317- 
1319 1323-1325 1327-1330 1334- 
1335 1339 1343-1347 1349-1350 
1354-1355 1357 1360 1362-1363 
1365-1367 1369 1372 1376 1378- 
1380 1386 1389-1391 1394 1400 
1403 1406 1409 1416-1419 1422- 
1427 1429 1435 1437-1438 1440- 
1442 1446 1448-1450 1453 1460- 
1461 1468 1470 1472 1474-1475 
1478 1482 1486 1490-1493 1496 
1498 1500-1504 1506 1508-1509 
1511-1512 1516 1518-1519 1521 
1524-1528 1531 1536-1538 1543 
1547 1550 1554 1556 1564 1567- 
1569 1580 1587-1588 1591-1592 



1597-1598 1600-1601 1611-1612 1618-1628 
1630-1631 1635-1638 1641 1646-1649 1652 
1654-1659 1661-1662 1664 1667-1669 1674 
1676-1679 1683-1684 1686-1688 1691-1692 
1699 1702 1707 1711 1713-1714 1717 1719 
1722 1726-1727 1730-1733 1738 1740 
1743-1744 1748-1752 1758 1760-1761 
1763-1764 1767 1769 1772-1773 1776 1779 
1783-1786 







fetal liver- 
spleen 



Columbia 
University 



FLS003 



103 300 318 321 352 372 379 381 
384 392-393 403 422 424 429 434- 
435 440 444 453 503 515 544 592 
978 1064 1324-1325 1327 1333 
1357 1369 1378 1418 1424 1622 
1646 1649 1680-1681 1689-1690 
1717 1743-1744 1769 



fetal liver 



Invitrogen 



FLV001 



15-16 26 34 58 61 64 70 75 78 89 
98 105 112 116 120-121 123 133 
151 166 176 180 194-196 198 200 
204-206 210-211 220 225-226 230 
235-236 239 247 259 261 267 272 
277 280-281 303 310 313 317 320- 
321 329 344 356 371 374 376 379- 
382 395 408 412 414 419 425 434- 
435 441-442 465-466 490 494 504- 
506 509 522 527 534 552-553 562 
567 569-570 572-574 607 631 657- 
658 667 669 672 685-686 702 717 
725-726 732 748 759 761 778 784 
786 809 817 829 837 857 861 872- 
873 875 881 889 894-895 909 911 
916 954 963 967 974 977 986 988- 
989 993 995 997 1000 1005-1006 
1008 1014-1015 1020 1042-1043 
1070 1086-1087 1089-1090 1118- 
1119 1122 1144-1145 1148 1153 
1157 1159 1183 1195-1196 1227 
1250 1257-1258 1262 1267 1280 
1285 1307 1312 1314 1317-1320 
1344-1345 1349-1350 1355 1362- 
1363 1403 1405 1415 1419 1425- 
1426 1429 1431 1442 1448 1463- 
1464 1469-1470 1489 1528 1536 
1539 1549-1550 1557-1562 1577 
1583 1598 1601 1611 1615 1622 
1644 1649 1666 1674 1706 1721 
1738 1746 1763-1765 1774 1776 
1779 



fetal liver 



Clontech 



FLV002 
FLV004 



676 998 1719 



fetal liver 



Clontech 



93 133 214 301 355 374 379 555 
581 601 679 837 847 859 1123 
1236 1270 1313 1324 1325 1327 
1355 1367 1425-1426 1536 1690 

1733 1760-1761 

26 37-39 50-51 58 84 86 89 98 
113 128 131-132 139 155 172 186 
194 198 201 206 211 230 231 256 
261 276 282 286 302 325 359 361 
376 379 383 398 412 413 419 430 
436 448 452 462-463 473 477 503 
519 529 561 569-570 590-591 597 
607 623 626 635 647 660 672 715 
725-726 730 733 761 775-777 788 
826 837 860 874 913 915 921 935 
970 980 986 988-990 992 1000- 
1001 1007 1014 1027 1035 1036 
1045 1060 1064 1070 1083 1097 



fetal muscle 



Invitrogen 



FMS001 



1099-1102 1116-1117 1121 1164 
1173 1198 1208 1228 1240 1258 
1266 1270 1277 1298 1317-1320 
1324-1325 1329 1336-1337 1369 
1383-1384 1399-1400 1403 1409 
1433 1505 1514 1542 1551 1554 
1557-1559 1562 1589 1599 1620 
1632 1644 1650 1652 1671 1675 
1712 1725-1726 1743-1744 1754 
1766 



fetal muscle 



Invitrogen 



FMS002 



119 221 273 402 426-427 463 547 
599 736 869 1000 1033 1083 1266 
1431 1440-1441 1468 1545 1599 
1673 1678-1679 1687-1688 1710 
1712-1714 1723 1725 1731-1733 
1743-1744 1760-1761 1767 



fetal skin 



Invitrogen 



FSK001 



1 4-11 15-16 20-23 25 29 33 40 
43 46 56-57 60-61 64-66 75 82 87 
97-98 105 107-108 113 116-119 
123 133 135-137 139 144 146 148 
151-153 156 163 170 176 180 188- 
189 197-198 200 202-203 210 218 
222 231 246-247 261 263 265-270 
277 285-286 290 293 299 301 307 
311 321 325 328 330 333-335 339 
341 345 351-352 355-356 358-359 
362 368 370 372 376 379-382 384 
388 394 404-405 408-409 411-412 
419-420 424 426-427 436 441-442 
445 448-449 454 462 465-466 472 
476 490 493 504 506 509 515-517 
519 526 531 537-540 547 549 560- 
561 567 572-573 581 584 589 611- 
612 615 623 630-631 635 647 649 
651 657-658 660 662-665 667 669 
672 676 678 681 688 701 704-705 
709-710 713 717 720-721 725-726 
728-729 732 748 750 753 759 764 
766 770 775-777 780-781 786 788- 
789 798 809 811 814 816-817 822 
824-826 831 842 857 859 861 863- 
864 881 894-895 908 910-911 916 
918 922-923 928 932-933 935 937 
945 948-949 953 960-961 966-967 
970 975 977 986 990 992-993 999- 
1000 1004 1007 1013 1018 1025 
1027 1032 1035 1041-1043 1054 
1057-1058 1060 1062-1064 1069 
1072 1077 1090-1091 1097 1099- 
1103 1108 1113 1119 1123 1128 
1131 1134 1140 1148-1149 1152- 
1153 1156 1163 1167 1178 1182 
1189 1192 1195-1196 1198 1201- 
1205 1208 1211-1212 1216 1219- 
1220 1222 1225 1240 1243 1258 
1266-1267 1274 1277 1280 1282- 
1285 1299 1310 1317-1322 1324- 
1325 1329-1330 1342 1344 1346 
1349-1351 1354-1357 1365-1366 
1369 1371 1373 1376 1378 1380 
1383-1384 1387 1399-1400 1405 
1410 1427 1429 1431 1433-1435 
1439-1441 1448-1449 1454 1457 
1468 1470 1472 1475 1480-1481 
1487 1490-1491 1493 1498 1509 
1512 1521 1525-1526 1529 1535- 
1536 1547 1549 1557-1559 1588 
1592 1595 1597-1598 1601 1603- 
1604 1608 1611 1614 1618 1624- 


















1626 1632 1634 1636 1641 1643- 
1644 1646 1654-1657 1660-1662 
1665 1668 1675 1685 1687-1689 
1702-1703 1709-1710 1716 1719 
1724 1727 1731-1732 1737-1740 
1742 1747 1749 1755 1760-1761 
1765 1772 1776-1777 1779-1780 
1786 








fetal skin 


Invitrogen 


FSK002 


13 286 302 307 313 321 330 335 
339 341 354 370 372 385 400 402 
408 414 426-427 433 436 450 454 
515 544 585 598 767 810 845 939 
1076 1109 1155 1317-1320 1326 
1333-1335 1343 1347 1350 1369- 
1371 1377-1378 1391 139-J 1422 
1466 1647 1656 1678-1679 1687- 
1688 1693 1718 1721 1725 1731- 
1732 1739 1755 








fetal spleen 


BioChain 


FSP001 


110 137 211 353 589 927 1108 1635 1771 








umbilical cord 


BioChain 


FUC001 


4-8 10 12 14 17 33-36 44-46 57 
64 68-69 75 82 85 101 104 113- 
114 116 119 122-124 133 137 153- 
154 157 161 163 166-167 175 181- 
184 186 192 197-198 200-202 212- 
215 230 234 246-247 251 256 263 
267 271-272 280-281 284 295 301 
314 317 321 326 333-335 345 351 
356 368 371-373 379-380 386 390 
392 394 406 408-410 412 414 416 
420 424 427 430-436 438 444-446 
454 459 461 463 467 473 482-483 
486 488 490 495 504 509 524 526 
537-540 547 555 561 574-577 588- 
591 593 606 615 620-621 632 637 
645-647 650 659-660 662-664 667- 
668 674-675 684 687 696 698 701 
703-705 709 711 714 719-720 725- 
727 732 749-750 762 765 771 775- 
777 780 789-791 793 796 802-803 
814-817 822 833 843 845 848 858 
861 864 875 879 888 894-895 897- 
900 903 906-907 911-912 925 930- 
933 936 940 948 953 960 966 977 
984 990 992 998 1000-1001 1005- 
1007 1016 1023 1025 1037 1046- 
1047 1059 1061-1063 1073 1076- 
1077 1089 1094-1097 1112-1113 
1115 1134 1144-1148 1151 1154 
1156 1163 1171 1197 1204-1205 
1208 1216 1218 1224 1234-1235 
1243-1244 1246 1279 1283 1286- 
1287 1298 1316 1320 1344 1346 
1350 1357 1359 1371 1373 1375 
1381 1398 1400 1403 1408 1414 
1424 1427-1428 1431 1433 1440- 
1442 1446 1454-1455 1479 1482 
1484-1485 1489 1492-1493 1504- 
1505 1513 1525 1527 1536 1538 
1546 1565 1567 1571 1573 1575- 
1576 1578-1579 1591 1595 1600- 
1601 1608 1612 1615 1621 1624 
1626 1636-1637 1647- 1646 1651 
1653 1656 1658 1661-1662 1672 
1675 1682 1684 1686-1688 1690 
1709-1710 1722 1727 1729 1735- 
1738 1740-1741 1760-1761 1768 


fetal brain 


GIBCO 


HFB001 


4 9 11-13 17-18 


22-23 25 


37-39 








42-47 50-51 54-55 58 


60-61 65-66 






















72 75 77 80 


82 


85 90-91 


94 100- 








102 


107 


110 


112-116 


118- 


119 122- 








123 


126 


128 


134 136 


-140 


147-148 








153 - 


155 


157 


161 165 


169- 


172 175 








181 


186 


188- 


-189 197 


-198 


204-206 








208 


210 


215 


222-223 


225-226 23D 








23 5- 


238 


240- 


-241 247 


253 


256-258 








260- 


262 


267- 


-269 276 


279- 


281 284 








286 


289 


298 


300-302 


307 


310 318 








321- 


323 


325 


33 


0-331 


339 


341 346- 








349 


352 


354 


356-359 


362 


3 64 - 365 








371- 


372 


377 


379-380 


3 82 


384 387 








390 


400 


408 


414-416 


415 


424 431 








434 - 


435 


438 


441-443 


449 


451 453- 








455 


457-463 


470 472 


-473 


47c d*7*7- 
•i f O *k f f — 








478 


482-483 


486-488 














499- 


-500 


502-504 


DUO - 












516 


519- 


-520 522 


COR- 










JJ u 


537- 


-540 


543-544 


54 6- 


547 566- 








SO / 


569- 


-570 


572-582 


585 


r n n e fi n 








~> j JL 


593 


595 


59 


9 601 


604 


606 - 609 








6ll - 


612 


614- 


62 


0 622 


-624 


630 632 








CI C 
Oj O 


643 


645- 


64 


7 650 


-652 


654 659 








ob J. 


665 


667- 


66 


B 670 


-672 


676 678 








G81 


687 


689 


692-694 


697 


699 710 








714 


717 


721 


72 


7 729 


-732 


734 736 








73 8 


743-746 


750-751 


759 


763 766 








Tin 


772 


775- 


777 784 


7B9 


791 796 








*7QO 


802-805 


810-811 


814 


819 - 821 








824 


826 


830 


834-837 


839- 


850 854- 








ocr 
obo 


858- 


860 


862 864 


869 


871 876- 








877 


879 


883 


886-887 


890- 










895 


898- 


901 


90 


5 908 


-910 


912 - 916 








919 


922- 


923 


925 927 


930- 










3j o 


948 


952- 


96 


3 963- 


-964 


967 969 — 








0*7 0 


975 


978- 


979 981 


983 


one _ QO*7 








oqn 


992 


995 


997 999- 


-1002 


1 005- 








1009 


1011-1013 


1016 


1018 


-1019 








1023 


1026 1029-1031 


1033 


— 1035 








103 8 


1041 1047 


1050 


1053 


1057 








1059 


1064 1068 


1070 


1072 


-1073 








1078 


-1079 1081-1082 


1086 


1089 








1094 


1097 11 


03 


1107-1109 


1113 - 








1115 


112 


1-11 


22 


1127 


1134 


-1135 








1138 


1140 1143 


1148- 


-1151 


1153 








1156 


-11S7 1159 


1167 


1170 


1175 








1193 


-1194 12 


00 


1202 


1207 


-1209 








1211 


121 


C 12 


19- 


-1220 


1226 


-1227 








1229 


123 


2-12 


34 


1240- 


•1241 


1243 








1246 


1249-1251 


1253-1254 


*1258 








1267 


-1268 1271 


1276 


1279 


1282 








1285 


-1289 1293-1294 


1305 


1307- 








1308 


1312 1316 


1320 


1327 


1338- 








1339 


1341-13 


44 


1346 


1349 


1355- 








1357 


1359 13 


65- 


1366 


1369 


-1370 


• 






1373 


-1375 13 


79 


1386 


1389 


1394 








1398 


1409 1413- 


1414 


1416 


-1417 








1420 


-142 


1 14 


25- 


1427 


1430 


1433 








1437 


143 


9 1442 


1445- 


1452 


1454- 








1457 


1459 1463- 


1464 


1468 


1470 








1474 


1477-14 


79 


1489 


1492 


1494 








1497-149 


8 1501- 


1503 


1507 


1509 








1511-1513 1517 


1S20- 


1521 


1524- 








1526 


1531-1533 


1535 


1537-1S38 








1S47 


1554 1556- 


1559 


1564-1567 








1571 


1584 1587 


1589 


1594 


1599- 








1601 


1611-1612 


1614- 


1616 


1619- 








1620 


1625-1628 


1630- 


1631 


1634 








1637- 


-163 


8 1640- 


1643 


1645 


164 8- 



1649 1651 1653-1655 1657-1658 
1664-1665 1667 1669 1673 1678- 
1679 1683-1684 1686 1693 1701 
1704-1705 1709 1713-1714 1717- 
1720 1724 1727-1728 1731-1733 
1737-1738 1743-1744 1752 1754- 
1755 1757 1760-1761 1765 1772 
1779 1785 



macrophage 
infant brain 



Invitrogen 



HMP001 



Columbia 
University 



IB2002 



5-8 110 204-205 503 634 678 859 
878 933 988-989 1379 1448 1504 

10 12-13 15-18 22-23 25 29 34 

37-39 43 47 50-51 54-56 58 60-63 
65-66 68-69 72-74 80 82-83 86 
88-92 97 100 102-104 106-108 110 
112-113 115-116 118 123 128 130 
134-136 138-139 143 147-149 151- 
152 154-155 163 165-167 169 172- 
175 181-184 186 193-196 198 201 
203-205 209-210 214-215 222 224- 
226 231-232 235-236 239 246-247 



252 257 260 268-269 272 276-277 
279-281 286 288 291-292 295 298 
300-301 304 307 310 313 321-323 
330-331 333-334 339 346-347 349 
352 356-357 362 371-372 377 379- 
380 383-384 392 397 401 406 408 
411 413-414 416 418-419 422 428 
430-431 434-435 438 443 449 453- 
454 461 464-466 469-470 472-473 
475-476 478 482-483 487 490 492 
494 497 503 507-508 510-513 516 
519-520 524-526 530-534 536-540 
547 550-551 561 563-564 566-567 
572-576 579 581-582 584-587 590- 
591 593 595-597 607-609 611-613 
616-617 620 622-624 627 631 637 
641 645-647 650-655 657-658 660- 
665 667-675 689 691 695 697 699 
703 707 713-715 717 721 728-731 
733-736 739 743 745 751 755 759 
763 769-770 772 778 780-781 785 
788-789 793-794 799 803 808 811 
814 825-826 830 834-836 840-843 
845 848-850 854-855 860 862 864- 
865 870 872 875-876 878 886 883 
890-891 894-896 898 903-904 916- 
917 919 922-925 927-928 930-932 
934-936 938 941 945-946 948-950 
953-954 959-962 966-969 977 979 
981 986-990 992 997 999-1000 
1004-1006 1014 1016 1018-1019 
1024-1025 1033 1036 1047 1051- 
1052 1054-1055 1057-1059 1063- 
1064 1068-1070 1073 1081-1082 
1085 1089 1108-1113 1118-1120 
1123-1124 1130 1132-1138 1140 
1149 1151 1153-1154 1163-1170 
1172 1174-1175 1183-1184 1188 
1190 1193-1194 1196-1197 1199 
1204 1208-1209 1211 1218-1222 
1226-1227 1229 1231 1234 1241 
1247 1249 1251 1256 1258 1261- 
1262 1269 1274 1279 1281 1283 
1285 1287-1289 1294-1295 1305 
1307 1313-1314 1316-1320 1329 
1332 1341-1342 1345 1349 1356 
1362-1363 1365-1366 1368-1370 
1374 1381 1383-1384 1388 1400 
1403 1406-1407 1413 1417 1420 
infant brain 






Columbia 
University 



IB2003 



infant brain - 



infant brain 



Columbia 
University 



IBM002 






1423 1429-1431 1435-1436 1439- 
1441 1443 1447-1449 1451-1452 
1454-1455 1457 1459 1463-1465 
1468 1470-1471 1475 1479 1482- 
1483 1485 1493-1494 1496 1490- 
1499 1502-1503 1505-1507 1509 
1522-1523 1525 1528 1531-1533 
1542 1546-1547 1549-1550 1554- 
1555 1563 1565-1567 1569 1575 
1580 1583-1586 1588 1590 1592- 
1593 1595 1598 1600-1601 1608- 
1610 1612 1614-1616 1619 1621 
1624 1626-1627 1630-1633 1637 
1639-1640 1642 1644 1647 1652 
1654-1655 1658-1659 1664-1665
1672-1673 1676-1681 1685-1688 
1693-1695 1701-1702 1704 1708 
1717-1720 1723-1724 1726-1728 
1733 1735-1741 1743-1744 1752 
1755-1758 1762 1765 1771 1774 
1777-1778 1786 



Columbia 
University 



IBS001 



17-18 20-23 29 34 43 60 68-69
78-80 88 100-101 107 110 112 118 
123 128 133 135-137 146 148 152 
159 166 169 174 194 198 203 215 
223 225-226 229 235-236 247 260 
276-281 286 290-292 295 300-301
310 322 324 331 334 339 346-347 
349-350 352 357 371 376-377 382
384 403 408-409 414-415 453-455 
472 476 478-479 490 503 507 516 
520 530 534 536-540 551 563 572- 
576 585 587 590-591 593 595-596
601 606 612 616-617 620 622-624 
650 652-653 661 665 670-671 674 
675 678 609 715 717 727-728 730 
734 759 775-777 780-781 785 796 
806-807 811 824 845-846 864 869 
875 882 889 894-895 898 904 917 
919 921-923 932 935-936 946 950 
954 962 977 979 997 999-1000 
1005-1006 1009 1011 1017 1024 
1033 1037 1043 1055 1057 1109 
1114-1115 1120 1123 1127 1144- 
1145 1149 1151-1153 1160 1167 
1170 1174 1193-1194 1196 1199 
1202 1206 1209 1220-1221 1226 
1229 1240-1241 1251 1258 1284 
1288-1289 1305 1314 1327 1333 
1344 1347 1350 1355-1357 1365- 
1366 1378-1379 1388 1400 1403 
1421 1423 1431 1436 1440-1441 
1446-1447 1457 1459 1471 1499 
1503 1507 1509 1535 1546 1557-
1559 1567 1572 1587 1595 1598
1610-1612 1615 1631 1639 1644 
1647 1657-1658 1673 1678-1681 
1683-1684 1701-1702 1708-1709 
1713-1714 1719 1757 1760-1761 
1765 1771 1778 

101 113 139 152 260 279 290-292
374 377 551 563 608-609 653 659
814 954 1005-1006 1029-1030 1130 
1164 1209 1258 1294 1305 1320 
1327 1397 1431 1498 1507 1615 
1640 1694-1695 1763-1764 1767 
1779 



10 12 119 175 279-281 321 334

371 446 551 563 623 652 667 669 
SEQ


ID NOS: 










Library Name
























671- 


672 


819 


949 


966 


1113 


1130 








1151 1188 1193- 


1194 


1196 


1229 








1256 


1265 1271 


1207 


1317 


-1319 








1324 


-1325 1342 


1423 


1440-1441 








144£ 


1471 1482 


1525 


1532 1546 








1562 1569 1588 


1591 


161C 


1618 








1647 1649 1658 














LFB001


5-9 


17 20-21 25 


68-69 82 94 


105 


fibroblast 






153 


157 


197 


-198 


203 


207- 


208 


212- 








213 


223 


262 


266 


283 


302 


321 


326 








333 


356 


370 


427 


430 


436 


446 


462 








472 


493 


498 


503 


516 


519 


527 


535 








537- 


540 


542 


-544 


562 


565 


567 


586 








599- 


600 


607 


615 


630 


647 


662- 


-664 








692- 


694 


712 


719 


745 


748 


775- 


777








794- 


796 


810 


837 


843 


-847 


849 


854- 








856 


869 


876 


903 


934 


953 


955-956 








964 


975-976 


984 


1000 1005-1007 








1024-1025 1033 


1039 


1053 


1064 








1070 1072 1082 


1112 


-1113 


1134 








1136-1138 1140 


1195 


1223 


1232- 








1233 


1246 1279 


1285 


1295 1311 








1320 1334-1335 


1343 


1427-1428 








1446 


1478 1482 


1493 


1504 


1537 








1552 1555 1567 


1575 


1582 


1598 








162C 


1625 1632 


1638 


164E 


1654- 








1655 1662 1680- 


1681 


1684 


1686 








1690 1696 1702 


1711 


1733 


1741 








1760-1761 1778 


1785








lung tumor


Invitrogen




5-10 18 


20-21 29 33 


-36 40 43 52 




54-55 61 65- 


-66 


68-70 73- 


75 80 85 








88-89 93-94 


100 


103 


106- 


108 


112- 








113 


115-116 


118 


-119 


123- 


124 


126 








130- 


132 


135-137 


139 


-141 


143-144 








147- 


148


151-153 


155 


-156 


159 


161 








164 


169 


171 


179 


-180 


185 


190 


192 








194 


196- 


-199 


203 


-208 


210 


212- 


-214 








216- 


217 


219 


222 


233 


240- 


241 


244 








246 


251 


-252 


255 


-256 


261- 


262 


256. 








272 


276- 


-277 


279 


-281 


284 


286 


286 








290 


295 


298 


301 


-302 


309- 


312 


317 








321 


329 


332 


341 


-342 


344- 


345 


348 








352 


358- 


-360 


363 


368 


370- 


371 


376 








380- 


381 


384


389 


-390 


398 


400 


409 








414 


423 


426-427 


430 


432- 


436 


443- 








444 


450-451 


454 


462 


468 


472-477 








480- 


483 


487- 


-488 


490-491 


493 


496- 








498 


500 


503- 


506


509- 


-512 


515- 


516








519 


521- 


-523 


526 


530 


534 


541 


544 








547 


554 


557 


564 


566- 


-567 


572-


576 








585- 


586 


588- 


-589 


595- 


-596 


601 


607 








611- 


612 


615 


619 


621 


623 


626 


630 








632- 


633 


644 


647


649 


651 


655- 


656 








660 


662- 


-665 


667 


669 


672 


683- 


684 








696 


700 


706 


710 


713 


716 


718- 


-719 








722- 


723 


728 


734 


-739 


743 


750 


752 








763 


765-766 


773 


-778 


784- 


785 


787- 








789 


791 


800 


802 


-803 


809- 


812 


814 








824 


826 


828- 


829 


832 


838- 


839 


841- 








845 


849-850 


852 


-855 


857- 


861 


864 








866


874 


878- 


880 


882 


887 


890- 


891 








897- 


898 


902 


904 


906- 


907


910 


916 








918- 


920 


922 


924 


-925 


927 


930- 


932 








934- 


935 


937 


947 


950 


953 


955- 


956 








961 


963 


966- 


967 


969 


971 


977- 


979 








981 


984 


986- 


987 


990 


992- 


993 


995 








997 


999- 


1001 


1005-1007 1009 










1012 


-1013 1018 1020 


1022 


-1024 








1026 


1029-1030 1033 


1038 


1041 
RNA Source


Hyseq 
Library Name 






SEQ 


ID NOS: 








1045 


1047 


-1050


1052 


1054-1055








1059 


1063 


-1064 


1067 


-1071 1073- 








1074 


1078 


1085 


1087 


1089 1095- 








1097 


1104 


1106 


-1107 


1109 1112 








1116 


-1117 


1119 


1126 


1134-1135 








1139 


1141 


-1142 


1144 


-1145 1148 








1152 


-1153 


1156 


-1158 


1167 1170 








1172 


1178 


1195 


-1196 


1198-1200 








1202 


1204 


1208 


1214 


1216 1219 








1222 


1227 


1234 


1241 


1247 1259 








1257 


-1258 


1265 


1267 


-1270 1276 








1278 


1280 


-1281 


1283 


1285 1288-








1289 


1295 


1300 


1305 


1308 1319 








1317 


-1321 


1329 


1338 


-1339 1341 








1344 


-1346 


1349-1351 


13^1- 1 'i cc 








1357 


1365 


-1366 


1369 


X.3 f O - X J / y 








1383-1385 


1394 


1397 


i dnrt i a no _ 

I'wU J.** U<£ - 








1403 


1408 


1417 


1419 










1431 


1433- 


-1436 


1438 


~\&AA 1 / / C 

X**l** X«t4b- 








1448 


1454- 


-1455 


1460 


1 ACC "» ACO 








1470 


1474 


1480- 


1481 


1 Afll 1 AQC 

XfsOO X^lob — 








1488 


1490-1491 


1494 










1508- 


1509


1511- 


1512 


1515-1516








1519 


1523- 


1524


1528 


-1529 1536- 








1540 


1546 


1549- 


1550 


1555 1560- 








1561 


1565 


1567 


1569 


1575 1588 








1591 


1593- 


1594 


1596 


-1598 1600- 








1602 


1608 


1614- 


1616 


1618 1620 








1624- 


1625 


1627- 


1632 


1636 1639 








1644- 


1645 


1647- 


1649 


1652-1653 








1656- 


1662 


1664 


1666 


-1667 1670- 








1671 


1673- 


1675 


1678- 


-1679 1683 








1685- 


1688 


1690- 


1692 


1696-1699 








1705 


1709 


1716- 


1717 


1722 1727 








1730 


1735 


1739 


1741 


1743-1744 








1748- 


1749 


1753 


1760-1762 1765 








1767 


1770- 


1771 


1773 


1775-1776 








1778- 


1779 


1786 






lymphocytes 


ATCC 


LPC001 


4 11- 


12 18 


24-25 30-31 48 50-51 








56-57 


68-69 80 


92 98 103 105 110 








126 137 152-153 


157 


165 172 188- 








189 197 203 210 


217- 


218 222-223 








225-226 229


231


247 


251 256 264 








272 280-281


284


300- 


301 321 325- 








326 339 348


352


357 


371 382 384 








390 400 404


412


414 


421 423 426- 








427 430-431


445


447- 


448 451 454- 








455 475


503 516


526- 


527 530 537- 








540 549 556-560 


563 


574 577 589








602 613 615-617 


621 


623 628-630 








636-637 647 649 


657- 


659 690 697 








717 723 755 764 


775- 


777 780 786 








789-790 793 800 


802


822 838 849 








866 869 875


881


-883


892 898 906- 








907 911 921-923 


928 


975 990 992 








996 1001 1004-1007 1033


1050








1054 1078 1107 1135 


1140-1141 








1143 1148 1158 1163 


1177 1199 








1205 1216 1226 1231 


1236 1241 








1244 1250 1258 1260 


1265 1269- 








1271 1290-1293 1308 


1312 1317 








1319-1320 1339 1345-1346 1348 








1350-1351 1357 1367 1369 1379 








1381 1383-1384 1386-1387 1389 








1394 1397 1405 1423 1425-1428 








1431 1437 1446 1448 1461 1466 








1470 1472 1474 1482 1492 1506 








1528 1537 1546 1549


1591 1598








1600 1603-1604 1606 1627 1636 
Tissue Origin



RNA Source 



Hyseq 
Library Name



SEQ ID NOS: 



1638 1647-1649 1651 1658-1659
1664 1676-1677 1680-1681 1687- 
1688 1699 1711 1715-1716 1726
1728 1737 1740 1746 1748 1752 
1756 1758 1777 1779



3-4 10-11 13 15-18 20-21 24-25 
30-31 35-36 40 43-45 48 50-51 
54-58 60-63 68-69 75 79-80 82-83 
85 88-91 93-96 98 100 103-104 
107-108 112 116 119 123 125-128
134-140 142 147-149 151 153 155 
157 162-163 167 169-172 174 177- 
179 186 190 192-199 203-207 210 
212-215 217-219 222-223 229 235- 
236 247 251 255-258 260 262 272 
274-277 280-281 285-286 297-301 
307-310 313-314 316-317 321 325- 
330 333-334 340-342 348-349 352 
354-358 370-371 380-385 387-388 
400 405 408-410 412 414-416 421- 
425 430-431 434-435 437 439 441- 
442 445-451 453-454 456 459 461- 
464 468-472 474-479 481 483-485 
487-491 496 499-501 503-504 509- 
513 516-519 522 526-527 529-531 
534 536-540 542 547-549 553-559
566-567 571 574-577 579 582 584-
586 589 593 595-597 601-602 604 
606-607 611-613 615-621 623 627- 
629 633 636-637 642 644-650 655 
659-660 662-665 667 669 674-675 
678 682-684 692-696 698 700 706 
708 710 716-720 725-726 729-736 
738-739 743-746 749 751 753 756 
759 765-766 768 770-773 780 784- 
786 788-790 793 796 798 800 802-
803 810-811 814 817 819 826 828- 
830 832 834-836 838 843 845-860 
863-864 866-871 877-879 881-892 
894-896 898 902 904-914 916 919- 
925 927 930-932 935-936 941-942 
945 948-949 953 955-956 958 960- 
962 964 967 970-971 973 975 977 
985-990 992-993 995-996 999-1002 
1004-1009 1011 1014 1017-1019 
1022-1023 1025 1027 1029-1031 
1033-1036 1038 1041 1043 1047 
1050 1053-1054 1058-1059 1061- 
1062 1064 1068 1070 1072 1078 
1085-1086 1089-1091 1093 1097 
1106-1107 1110-1113 1115-1117 
1122-1123 1125 1129 1132-1133 
1135-1137 1140-1145 1152 1158 
1163 1168 1170-1174 1176-1178 
1180 1182-1183 1186 1195 1198- 
1200 1202 1205-1206 1211 1216 
1219-1221 1223-1227 1230-1236 
1238-1242 1247 1252 1254 1256 
1258 1261-1262 1264-1265 1269-
1270 1272-1275 1277 1280-1284 
1287-1293 1299-1300 1306 1308
1312-1313 1317-1320 1322 1324- 
1330 1333-1335 1339 1341 1343- 
1347 1349 1353-1357 1359-1361 
1365-1367 1369-1370 1373-1374 
1377 1379-1381 1386-1387 1394 
1400 1403 1409 1419 1423 1425-
1428 1430-1431 1433-1434 1437- 
1438 1440-1442 1446-1448 1450 



leukocyte 



GIBCO 



LUC001
Tissue Origin



RNA Source 



Hyseq
Library Name 



SEQ ID NOS: 



1453 


1458- 


1459 


1463- 


1464 


1468 


1470- 


1471 


1474 


1477-1478 


1482- 


1486


1490- 


1493 


1496-1501 


1504 


1506 


1509 


1512-1513 


1516 


1519 


1521- 


1522 


1524- 


1525 


1527- 


1528 


1531 


1534 


1538 


1541 


1545- 


1547


1549-1550 


1553 


1555- 


1556 


1560 


1565 


1567 


1575 


1580 


1589 


1591 


1594 


1596 


1598 


1600- 


1602 


1606- 


1608 


1611 


1614 


1620- 


1621


1624 


1626- 


1629


1631- 


1632


1636 


1638- 


1639 


1641 


1644-1645 


1648- 


1650


1653- 


-1655 


1658- 


-1660 


1662 


1669- 


1670 


1675 


-1679 


1684-1688 


1690- 


1692 


1696 


1700 


1702 


1707- 


-1709 


1711 


1716 


-1717 


1720 


1723 


1725- 


1727 


1733 


1737-1738 


1741 


1743- 


1744 


1748-1749 


1752 


1755 


1760- 


1762 


1765 


1769 


1771 


-1772 


1781- 


1784 


1786 











69 75 82 102~ 
244 280-281 
455 461 476- 
554 575-576 
622 624 630 
679 698 764 
851 856-857 
952 990 992 
1183 1216 
1346 1353 
1515 1534 
1614 1621 
1691-1692 



leukocyte 



Clontech 



LUC003 



4 35-36 44-45 61 68 
119 139 154 179 197 
324 372 404 430-431 
477 481 503 537-540 
581 589 608-609 621 
632 647 662-664 669 
773 775-777 802 848 
879 905-907 915 949 
1002 1113 1119 1170 
1236-1237 1241 1275 
1357 1359 1377 1506 
1553 1591 1600 1613 
1628 1670 1676-1677 

1699 1733 1738 1772 

25 35-36 43 80 104 126 128 150
163 166 188-189 197 210 215 220 
271 277 280-281 310 317 336-338 
345 351 372 380-381 383 387 412 
415-416 430 445 448 454 456 467 
461 490 499 503 526 528 546 548 
567 575-576 588 601 613 615 647 
660 665 734-735 737 759 778 787
790 800 832 845 856 859 869 878 
883 887 905 914 932 934 958 976 
985 990 992 999-1000 1025 1031 
1038 1050 1055 1068 1074 1088 
1099-1102 1107 1136-1138 1149 
1156 1163 1172 1190 1195 1200 
1214-1215 1217 1226-1227 1235 
1238-1239 1244 1253 1278-1280
1293 1311 1320 1330 1334-1335 
1345 1355 1367 1386-1387 1394 
1403 1406 1414 1423 1437 1442 
1465 1521 1529 1536 1539 1541 
1547-1548 1582 1620 1626 1631 
1638 1647 1653 1660 1667 1669- 
1670 1680-1681 1696 1704 1715 
1724-1725 1731-1732 1750 1760- 
1761 



melanoma from 
cell line ATCC 
#CRL 1424 



Clontech 



MEL004



mammary gland 



Invitrogen



MMG001 



5-8 10 12 14-18 20-21 24-25 29 
33-39 42-43 52 55-58 60-64 68-69 
71 73-74 79-80 82 89 98 100 103 
106 108 112 123 128 133-137 144- 
146 148 150-152 154 158-159 165-
166 170-172 174 176 178 181-185 
188-190 194-198 201-206 210 217- 
222 224 227-228 231 233-237 247 
251 253-254 256 261-263 266-267 
271 276-277 279-281 284-286 288 
Tissue Origin



RNA Source 



Hyseq 
Library Name 



SEQ ID NOS: 



290 7.31 299 301 304 
320-321 323-325 327- 
334 339 341 344-345
359-360 362-363 368 
303 380 390 393-395 
406 412 414-415 423
441-444 448 451-455 
476 479 482 485-486 
495 498 503 506 509
519-520 522 527 529 
547 549 554 557 562
589-591 597 602 607
629 632 634-640 644 
652 655 657-658 660 
672 674-676 679 682 
706-707 710 713 717 
732-734 736 738 743 
755 759 761 766 770 
789 794 803 806-807 
822 827-829 837 842
864 866 869-870 872 
893-900 904 906-907 
921-923 926 935-937 
953-954 957 960-961 
970 977-978 984-989 
1000-1001 1005-1006
1014 1016-1017 1023 
1032-1033 1036 1039 
1055 1057-1058 1063 
1077-1078 1085 1087
1095-1102 1107-1108 
1121-1123 1131-1133 
1139-1142 1144-1145 
1153 1159 1167 1170 
1183-1185 1190-1192 
1207-1208 1212 1216 
1223 1225 1231 1234 
1247 1253-1254 1258 
1262 1270-1280 1283 
1298 1307 1314 1316 
1325 1330 1334-1335 
1349-1352 1354-1355
1370 1377 1379 1381 
1389 1405 1414 1419 
1425-1426 1428-1429 
1437 1439 1448-1449 
1460-1464 1466 1471 
1487 1489-1491 1493 
1512 1519 1526-1528 
1536 1539 1542 1547 
1554 1561-1562 1564 
1576-1579 1581-1582 
1592 1594 1596-1597 
1607-1608 1610 1612 
1621-1622 1625-1626 
1636 1641 1643-1644 
1652 1654-1655 1657 
1662 1664-1666 1669 
1674 1676-1677 1680 
1692 1701 1706 1713 
1720 1723-1728 1730 
1740 1742-1744 1746 
1751 1753 1760-1762 
1771 1774 1776-1777 
1784 1786 



309-312 318 
329 331-332 
348 350 356 
371 376 379- 
397-398 405 
430 434-437 
462-464 474 
488 490 494- 
512 516-517 
534 537-541 
572-574 587 
618 623 628- 
647-648 650- 
665 667 669- 
688 695-696 
720 722-730 
747-748 750 
780 784 786-
809 814 817- 
854-858 863- 
878 881 889
911 916 919 
946 948-949 
963 965-966 
993-997 
1008 1013- 
1025 1027 
1043 1045 
1068-1075 
1089-1091 
1112-1119 
1136-1137 
1148-1149 
1172-1173 
1196-1199 
■1218 1222- 
1240-1241 
-1259 1261- 
1285-1286
•1320 1323- 
1342-1345 
1359 1369- 
1383-1384 
1421-1423 
1431 1434- 
1454 1457 
1480-1483 
1505 1507
1532 1534 
1549-1550 
1567 1572 
1587-1588 
1601-1602 
-1616 1618 
1631 1635- 
1647 1650 
-1658 1660 
-1671 1673-
•1685 1689- 
•1715 1719- 
•1732 1738 
-1747 1749 
1765-1768 
1779 1783- 



induced neuron 
cells 



Strategene 



NTD001 



29 35-36 80 116 123 

214 230 280-281 284 

330 340 358 371 375 

422 424 492 497 532 



156 163 181 
•285 307 321 

377 380 382 
•533 542 546 
Tissue Origin



RNA Source 



Hyseq 
Library Name 



SEQ ID NOS: 



retinoic acid

induced 
neuronal cells 



neuronal cells 



Strategene 



NTR001



549 566 586 595 612 646-647 654

734 775-778 780 752 799 821 826 

856 858 875 936 953 905 990 992

1041-1043 1055 1072 1104 1193- 

1194 1206 1223 1246 1253 1274

1288-1289 1291 1294 1311 1320 

1349 1359 1412 1423 1485 1620 

1623 1645 1684 1705 1715 1751 



Strategene



5-8 78 268-269 277 383 431 506
623 677 731 999-1000 1199 1425 
1426 1547 



NTU001



29 65-66 80 82 110 119 146 152

166 174 181-185 198 227-228 253 
284 309 325 332 334 336-338 375 
391 393 406 414-416 454 465-466 
470 488 503 506 510-512 519 537-
540 572-574 597 602 607 623 647 
661 700 702 716 743 771 792 858 
904 948 954 977 1000 1005-1006 
1025 1064 1068 1122 1148 1185 
1219 1226 1234 1246 1271 1283 
1295-1296 1311 1317-1320 1329- 
1330 1350 1355 1365-1366 1378 
1383-1384 1400 1412 1445 1505 
1539 1547 1578 1647 1656 1683 
1690 1738 1749 1783-1784 



pituitary 
gland 



Clontech 



PIT004 



placenta 



311 314 379 408 419 430 454 1055 

1095-1096 1272-1273 1312 1320 

1378 1652 1671 1720 1725 1736 
1741 1755 



Clontech 



PLA003 



prostate 



Clontech 



5-8 124 208 277 370 843 906-907
1280 1317-1319 1359 1609 1621 
1737 



PRT001 



rectum 



Invitrogen



REC001 



9 46 57 71 107 147 171 177 197

201 229 231 242-243 274 280-281 
307 310 317 330 358 373 382-383 
400 430 434-436 461-462 469 477 
489 497 500 505-506 513 521 526
531-533 547 618 649 657-658 662- 
664 710 729 767 771 789 820 861
871 874 890-891 905 938 945 963- 
964 988-989 1002 1025 1033 1045
1061 1095-1096 1112 1125 1142 
1196 1198 1202 1232-1233 1241 
1258 1272-1273 1287 1295 1313 
1333 1341 1344 1349 1360 1362- 
1363 1367 1437 1442 1447 1475 
1478-1479 1482 1489 1513 1517 
1527 1531 1536 1598-1599 1628 
1636 1657 1680-1681 1687-1688 
1717 1738 1743-1744 



17-18 29 33 62-63 71 73-74 83 86 
113 126 146 153 158 167-169 195 
200 206 261 309 312 341 344 368 
373 388 395 408 414 420 430 441- 
442 446 448 464 468 483 517 537- 
540 547 567 585 589 602 623 628- 
629 632 645-647 651 657-658 669 
717-719 721 725-726 738 748 750 
756 762-763 766 770 774 790 819 
825 843 849 851 881 903 909 948- 
949 960 986 996 1020 1023 1033- 
1034 1064 1067 1070 1075 1086 
1108-1109 1113 1130 1139 1153
1159 1172 1178 1185 1187-1189 
1205 1220 1225 1240 1244 1271 
1317-1320 1323 1334-1335 1350- 
1351 1355 1369 1373 1375 1425- 
Tissue Origin


RNA Source 


Hyseq 


SEQ ID NOS: 






Library Name 










1426 1436 1439 1469 1474 1477 








1482 1546 1587-1588 1592 1596 








1610 1622 1627 1644 1658 1662








1665-1666 1669 1675-1677 1749 








1786 


salivary gland 


Clontech 


SAL001 


10 55 97 103 110 140 149 152 158 








198 217-218 242-243 256 301 308








312 321 333 351 354 360 410 437 








448 473 487 494 496 501 535 555








569-570 572-573 590-591 624 636 








651 759 762 764 768 771 788 800 








809 826 848 865 879 906-907 925 








933 963 1016 1020 1025 1040 1046 








1055 1066 1103 1150 1172 1181 








1234 1281-1282 1288-1289 1298 








1315 1320 1333 1336-1337 1346 








1359 1373 1379 1424 1447 1449 








1474 1482 1492 1494 1498 1511








1523-1524 1537 1554 1596 1626- 








1627 1636 1652-1655 1658 1665 








1671-1672 1691-1692 


salivary gland 


Clontech 


SALS03


158 326 1423 1463-1464 


skin 


ATCC 


SFB001 


1320 1400 


fibroblast 








skin 


ATCC 


SFB002 


262 736 1025 1253 


fibroblast 








skin 


ATCC 


SFB003 


709 1119 1350 1631 1653 


fibroblast 








small 


Clontech 


SIN001 


25 142 146-147 151 155 198 203 


intestine 






244 260 271 280-281 285 288 29H








301-302 308 312 334 340 371 398








408 412 414 416 423 425-427 430 








434-435 445 452 454 478 503 5lS 








519 521 523 543 547 549 555 559 








563 569-570 585 592 604 611 626 








628-629 632 650 659 681 710 714 








718 750 764 780 798 829 842 857 








859 866 887 892 894-895 901 904 








906-907 912 919 935 997-998 1000 








1007-1008 1026-1028 1044 1055 








1089 1097 1116-1117 1131 1148 








1169 1199 1219 1234 1247 1264 








1279 1316 1320 1326 1341 1343 








1349 1351 1374 1387 1398 1400 








1403 1407 1423 1428 1468 1498 








1501 1521 1550 1556 1585 1597








1636 1638-1639 1645 1653 1656








1662 1671 1675 1684 1691-1692 








1704 1711 1717 1719 1722 1725- 








1726 1729 1733-1734 1743-1744 








1762 1767 1780 1785 


skeletal 


Clontech 


SKM001 


18 20-21 82 84 101 118 134 148 


muscle 






151 153 166 225-226 258 274 277 








289 329 361 412 414 424 440 452 








459 470 488 503-504 537-540 647 








660 673-675 715 773 780 786 830 








905 922 950 963 982 990 992 1020 








1047 1063 1115-1117 1121 1134 








1228 1268 1284 1298 1321 1329 








1336-1337 1343 1409 1413-1414 








1509 1599 1624 1644 1653 1712 


skeletal 


Clontech 


SKM002 


168 1.683 1712 


muscle 








skeletal 


Clontech 


SKMs03 


235-236 1409 


muscle 








skeletal 


Clontech 


SKMS04 


235-236 


muscle 








spinal cord 


Clontech 


SPC001 


4 9 11 17 30-31 35-36 43 46 60 
Tissue Origin | RNA Source



Hyseq 
Library Name 



SEQ ID NOS:



adult spleen | Clontech



SPLc01



82 85 92 94 108 110 116 139 157 
167 198 204-205 210 215 229 256 
259 277 280-281 300-302 304 315 
317 372 379 387 392 419 426-427 
430 433 448 467 473 487 489 506 
509 513 519 524 526 537-540 543
547 549 551 559 567 569-570 593 
607 616-617 623 625 637 649-650 
652 657-658 670-671 673 679 681-
682 709 711 715 719 728-729 734 
749-750 753 775-777 782 785 791 
809 820 832 834-836 847-849 854- 
855 858 861 864 871-872 875 884 
898 906-908 917 919 924 934 942 
944 970 985 990 992-993 998 1013 
1039 1053 1059 1065 1072 1075 
1077 1082 1085 1097 1103 1109 
1116-1117 1128 1134 1151 1170 
1174 1192-1194 1215 1225 1241 
1243 1283 1294 1307 1312 1320 
1323 1327 1330 1350 1353-1354 
1356 1359 1368 1375 1400 1406- 
1407 1423 1429 1437 1443 1448 
1454 1470 1482 1492 1501 1508 
1511 1529 1538 1548-1549 1565 
1571 1578 1598 1600 1614 1625 
1627 1630 1639 1646 1651-1652 
1670 1686 1696 1740 1751 1755 
1771 

117 312 326 348 424 426-427 431



845 866 1320 1330 1333 1344 
1355-1357 1371 1387 1397 1446
1538 1579 1669 1686 1739 1767 



Clontech 



thalamus Clontech 



SXOO 0 1 I 10 15-16 61 68-69 100 117 149 

197 201 227-228 231 249 273 280- 
281 287 291-292 302 312 358 362 
426-427 430 446 462 475 479 535 
597 620 630 651 662-664 722 739
780 782 785 846 919 960 964 966- 
967 976 1008 1012 1032 1042 1063 
1071 1135 1170 1208 1234-1235 
1259 1277 1280-1281 1322 1349 
1359 1369 1449 1468 1474 1478 
1487 1493 1498 1557-1559 1622
1634 1651 1653 1729 



THA002 | 9 11 25 85 87 112 137 146 180

190 198 206 210 212-213 235-236 
239 261 268-269 279 290 301 325
333-334 341 351 356 364-365 379 
388 393 396 419-420 441-442 458 
477 483 508 525 531 549 567 606 
608-609 647 681 715 725-727 736 
774 782 784 794 827 883 890-891 
899-900 961 997 999-1001 1004 
1034 1055 1097 1129 1144-1145
1150-1151 1157 1172-1173 1177 
1193-1194 1208 1220 1249 1280
1305 1345 1355 1369 1434-1435 
1440-1441 1454 1496 1546 1549
1562 1572 1578 1590 1594 1613- 
1614 1640 1652-1652 1672 1687- 
1688 1703 1743-1744 1746-1747 
1753 



thymus | Clontech | THM001 | 44-45 54 57-58 62-64 79 104 123

126 134 153 193 212-213 218 242- 
243 258 274 277 279 297 301 307 
327 330 333 342 351 358 371 410 
430 445 465-466 468 471 483 487 
493 503 506 509 517 526 535 537- 
Tissue Origin



thymus 



RNA Source 



Clontech



Hyseq 
Library Name 



THMC02 



SEQ ID NOS: 



540 545 548 554 567 584 586 590- 
591 604 612 621 638-640 645-647 
649 656 660 665 670 698 710 720 
728 735 739 746 759 762 766-767 
775-777 780 784-785 800 802 809 
824 826 828 845 851 858-859 864 
866 870-871 878 884 887 892 899- 
900 927 930-931 967 983 986 990 
992 999 1014 1029-1030 1033 1059 
1066 1073 1103 1107 1113 1116- 
1117 1119 1140-1142 1158 1163 
1172 1177 1195 1206 1209 1213 
1216 1218-1219 1221-1222 1227 
1271 1277 1282 1320 1329 1349 
1367 1369 1383-1384 1417 1419 
1423 1425-1427 1448 1477 1488 
1493 1536 1554 1620 1644 164C 
1649 1654-1655 1661-1662 1669- 
1670 1674 1676-1677 1685-1688 
1707 1711 1731-1732 1737 



5-9 15-21 25 33 35-36 43-45 48

50-51 54-55 60 75 83 87 89 93 
98-100 102 105 112 117 135-137 
141 143 146 157 167 169 192 196 
211 217-219 222 224 229 233 235- 
236 240-241 244 251-252 256 261-
262 268-269 286 288 290 295 297 
301-302 309-310 315-317 321 324 
327 334 342 350 352-353 360 370- 
373 382 384 400 403 410 414-416 
424 430-431 436 445 454-456 461
464-467 470 472 474-476 483 488 
497 500 504 506 513 516 519-520 
524 526 530-531 534 537-540 549 
554-555 565-566 569-570 572-573 
575-577 586-587 595 603-604 606 
612 630-632 634 636 647 650 657- 
660 666-667 669 673-675 678 698 
700 703 708 720 725-726 731 738- 
739 743-744 750-753 757 759 763- 
765 767 772-779 787 789-790 798 
800 810 823 829 834-836 841 848 
854-856 859 861 864 870-871 881
890-891 898 908-909 913 928 933 
941 949 958 961 963 967 969 975 
981 986 988-990 992 999 1007- 
1008 1014 1016 1039 1041 1073- 
1074 1079 1089 1097 1109 1114- 
1117 1122 1131 1140-1141 1144- 
1145 1163 1172 1175-1177 1186 
1196 1198 1206 1211 1216 1220 
1223 1227 1234-1243 1261-1262 
1267 1271 1280-1281 1284 1290 
1308 1317-1320 1322 1324-1325 
1327 1330 1334-1335 1339 1346 
1350-1351 1355 1357 1360 1370 
1374 1377-1379 1386 1389-1390 
1392 1397 1400 1402 1406-1407 
1417 1423 1425-1427 1440-1441 
1466 1474 1477 1483 1493 1498 
1504 1506 1525 1536 1545 1549
1566 1594 1598-1600 1608 1611 
1614 1621 1623 1625 1632 1639 
1641 1644 1647 1649 1653-1656 
1658 1662-1663 1671 1673 1678- 
1681 1686-1688 1693 1705 1707 
1711 1717-1718 1726-1727 1731- 
1733 1737-1738 1743-1745 1758- 
1761 1771-1772 1779 1786 
Tissue Origin


RNA Source


Hyseq 






SEQ ID NOS: 






I 


Library Name 












thyroid gland


Clontech


THR001


4 9 


-10 


20-21 37-39 


48 50-51 54- 








57 


60-61 65-66 71 83 94 


-96 98- 








100 


102 


104 110 112 


115 


-117 119 








123 


127 


133 136-13*3 


140 


149 152- 








153 


155- 


158 163-164


168 


-169 171 








186 


190- 


-192 197 201 


-203 


219-220 








229 


233- 


-237 246-247 


253 


256 258 








262 


265- 


-266 268-269 


277 


280-281 








284 


-286


288-289 298


-299 


302 309- 








311 


317 


321 326 332 


335 


341-342 








344 


348 


350 354 358 


-359 


363 368 








371 


-373 


382-383 385 


394 


398 400- 








401 


411 


414-415 421 


424 


430-431 








433 


-436 


443-446 450 


-452 


454-455 








458 


472- 


474 476-478 


482 


484-485 








487-488


490-494 496 


-497 


500-501 








503- 


-504 


506 509-513 


516-517 519 








524 


526- 


527 529 535 


-540 


547 549 








562 


564 


569-570 575


-576 


588 594- 








595 


601- 


602 604 606 


610 


612 615- 








617 


619- 


623 628-630 


634- 


-635 642 








647 


649- 


651 660 662


-665 


668 670 








681 


690- 


694 696 698 


700 


TftO Til 








727- 


729 


732 734 738 


740- 


741 743 








745 


750 


759 761 763 


765 


Tin TT3 








780 


785 


795-796 798 


802 


o n a q o o 








824 


826 


828 833 838 


841- 


845 847 








849 


857- 


860 867 874-875 


ana a a 








881 


887- 


888 890-892 


894- 










908 


910- 


911 913-914 


922- 


923 926- 








927 


929 


932-934 937 


939 


941-942 








948 


953


957 961 963- 


-964 


966 978- 








979 


981- 


982 987 990 


992 


1001 








1004 


-1006 1010 1014 


1020 


1024 








1033 


1038-1039 1044 


1047 


1050 








1052 


-1054 1055 1058 


1068 


1070- 








1071 


1077-1079 1088 


1094 


-1097 








1105 


-1106 1112-1113 


1116 


-1117 








1124 


1126 1128-1129 


1131 


1134 








1136 


-1137 1142-1143 


1146 


-1147 








1149 


-1150 1156 1161- 


1164 


1167 








1170 


-1173 1177-1181 


1190 


1192 








1197 


1200 1204 1206- 


1209 


1214 








1217 


1219 1222 1230 


1232 


-1233 








1235 


1241 1245 1247 


1254 


1257- 








1258 


1260 1262 1271- 


1273 


1283 








1286 


-1289 1299 1306 


1314 


1320 








1330 


-1332 1334-1335 


1342 


1345 








1349 


1365-1367 1370- 


1372 


1374 








1381 


1394 


1407 1419 


1428 


1436- 








1437 


1440 


-1441 1443 


1446- 


-1449 








1454 


1459 


1461-1462 


1468 


1470- 








1471 


1475 


1477 1479 1482 


1491 








1497-1498 


1504-1505 1507 


1513 








1522 


1524 


-1526 1528 1531 


1534 








1536- 


1537 


1548 1550 1553 


1555- 








1559 


1562 


1567 1578 1590- 


1591 








1597 


1599 


-1601 1612 1614 


1616 








1619- 


1620 


1622 1624-1626 


1628 








1631- 


1632 


1634 1636 1639


1644- 








1645 


1648 


1651 1653-1656 


1658 








1660 


1662 


-1663 1667 1669 


1671 








1675 


1678 


-1681 1683-1686 


1689 








1691- 


1692 


1703 1709-1711 


1717 








1724- 


1726 


1729 1734 1737- 


1738 








1740 


1743 


-1744 1749 1753 


1759- 








1761 


1770 


1777 1786 






trachea


Clontech


TRC001 


9 29- 


31 46 48 87 104 


107 


110 135 








158 222 262 266 286 301 318 331 
Tissue Origin



uterus 



RNA Source 



Hyseq 
Library Name 



Clontech 



UTR001 



SEQ ID NOS: 



352 372 377 384 414 424 445-446
454 472 474 491 496 560 579 588 
593 597 607 612 626 681 702 719 
810 859 866 878 894-895 912 916
922 932 935 1046 1075 1080 1099- 
1102 1113 1208 1215 1232-1233 
1237 1281 1312 1385 1387 1405 
1414 1424 1430 1437 1447 1505 
1569 1579 1586 1600 1641 1653 
1667 1671 1676-1677 1683 1691-
1692 1711 1717 1726 1772 



17 19 25 41 46 57-58 61 89 104
108 139 152 174 196 200-201 206 
263-265 274 290 387 408 420 438 
446 448 452 473 491 493 499 503
506 513 519 522 526 530 542-543 
560 601 610 632 659 665 720 751 
773 780 833 845 857 872 877 912 
923 934 937 996 1009 1011 1018 
1050 1075 1107 1124 1170 1219 
1258 1279 1287 1310 1320 1323 
1343-1344 1375 1437 1451-1452 
1478 1481 1498 1519 1521 1536 
1552 1579 1597 1602 1606 1620 
1626-1627 1649 1652 1661 1670 
1719 1722-1723 



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TABLE 2 





SEQ 
ID 
NO : 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


IDENTITY 


1 


Y41736 


Homo 
sapiens 


Human PROH14 protein 
sequence. 


1398 


100 


2 


Y66656


Homo 
sapiens 


Membrane-bound protein
PRO943.


2389 


99 


3


AF113136


Homo sapiens 


IL-1 receptor-associated
kinase-M; IRAK-M


3043 


100 


4 


AF017806


Mus musculus 


Zn-15 transcription factor 


6351 


77 


5 


X02761 


Homo sapiens 


fibronectin precursor 


10535


98 


S ' 


X02761 


Homo sapiens 


fibronectin precursor 


8990 


89 


8 


X02761 


Homo sapiens 


fibronectin precursor 


12564 


99 


9 


AJ011679 


Homo sapiens 


Rab6 GTPase activating 
protein, GAPCenA 


5251 


93 


10 


W88501


Homo sapiens 


Human stomach carcinoma clone 
HP10415-encoded protein.


2381 


100 


11 


AF117754


Homo sapiens 


thyroid hormone receptor- 
associated protein complex 
component TRAP240 


11336 


98 


12


Z97630 


Homo sapiens 


dJ466N1.4 (novel protein 
similar to ANK3 (ankyrin 3, 
node of Ranvier (ankyrin
G)))


896 


100 


13 


Y58620 


Homo sapiens 


Protein regulating gene 
expression PRGE-13. 


1894 


98 


14 


AF213457 


Homo 
sapiens 


triggering receptor expressed 
on myeloid cells 2 


1238 


100 


15 


AF233453 


Homo sapiens 


RACK-like protein PRKCBP1


3124 


99 


17 


AF201303 


Homo sapiens 


dhfr oribeta-binding protein
RIP60


3130


98 


18 


AF064205 


Homo sapiens 


dynactin 1 p150 isoform


6377 


100 


19 


U00059 


Saccharomyces
cerevisiae


Yhr121wp


174 


26 


20 


AB032903 


Homo sapiens 


guanosine monophosphate
reductase isolog 


1801 


99 


21 


AB032903 


Homo sapiens 


guanosine monophosphate 
reductase isolog 


1485 


99 


22 


AF140507 


Homo sapiens 


Ca2+/calmodulin-dependent
protein kinase kinase beta 


3083 


99 


23 


AF140507 


Homo sapiens 


Ca2+/calmodulin-dependent
protein kinase kinase beta 


2300 


99 


24 



AJ289131 


Homo sapiens 


chondroitin 4-O-sulfotransferase


2211 


99 


25 


U33460 


Homo 
sapiens 


DNA-directed RNA polymerase
I, largest subunit 


8777 


98


26 


Y44488 


Homo sapiens 


ACRP30R2 variant protein. 


1387 


100 


27 



U43701


Homo sapiens 


ribosomal protein L23a 


751 


100 






U02032 


Homo sapiens 


ribosomal protein L23a 


767 


97 




29


Y41324 


Homo sapiens 


Human secreted protein 
encoded by gene 17 clone 
HNFIY77.


1083 


99 




30 


W71749 


Homo sapiens 


Human ubiquitin conjugation 
system protein 2. 


715 


90 




31 


W71749 


Homo sapiens 


Human ubiquitin conjugation 
system protein 2. 


631 


82 




32 


AF231917 


Homo sapiens 


long-chain 2-hydroxy acid
oxidase HAOX2 


1811 


100 




33 
34 


Z29481 
AB001451 


Homo sapiens 
Homo sapiens 


3-hydroxyanthranilic acid 
dioxygenase

Sck 


1507 


99 




35 


Y00644 


Homo sapiens


precursor polypeptide (AA -34 
to 287) 


2869 
1667 


100 
99 




36 


Y00644 


Homo sapiens 


precursor polypeptide (AA -34 
to 287) 


1104 


98 




37


Y78795


Homo sapiens 


Human antifcuai-2 (AZ-2) amino 
acid sequence.


3586 


78 




38 


Y78795 


Homo sapiens 


Human antizuai-2 (AZ-2) amino 
acid sequence.


4726 


99


39 


Y78795 


Homo sapiens 


Human antizuai-2 (AZ-2) amino 
acid sequence.


3556 


77 


40 


U93121 


Homo sapiens 


M-phase phosphoprotein-1


3747 


100 


41 


Y42750


Homo sapiens 


Human calcium binding protein 
1 (CaBP-1). 


795 


100 


42 


AF282626


Homo sapiens 


latexin 


1189 


100 


43 


G02150 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6231. 


384 


94 


44 


U19617 


Mus musculus 


Elf-1 


2724 


88 


45 


U19617 


Mus musculus 


Elf-1 


2062 


86 


46 


AF100758


Homo sapiens 


osteoinductive factor OIF 


1538 


100 


47 


Y87591


Homo sapiens 


Human SPROUTY-1 protein, SEQ
ID NO:24. 


1737 


99 


49 


X04145 


Homo sapiens 


T3 gamma precursor (aa -22 to
160) 


942 


99 


51 


X63547 


Homo sapiens 


oncogene 


5845 


99 


52 


M94043 


Rattus 
norvegicus 


rab-related GTP-binding 
protein 


1089 


96 


53 


L31783


Mus musculus 


uridine kinase 


917 


71 


54 


X83973 


Homo sapiens 


transcription factor 


4486 


98 


55 


AF224741 


Homo sapiens 


chloride channel protein 7 


4128 


99 


56


W74805 


Homo sapiens 


Human secreted protein 
encoded by gene 77 clone 
HOEAS24.


1491 


100


57 


Z50907 


Homo sapiens 


Human TBC-1 cDNA from second 
transcript.


4824 


100 


58 


D79994 


Homo sapiens 


similar to ankyrin of 
Chromatium vinosum. 


6089 


99 


59 


D79994 


Homo sapiens 


similar to ankyrin of 
Chromatium vinosum. 


4014 


91 


60 


Y59738 


Homo sapiens 


Human normal ovarian tissue 
derived protein 15.


601 


100 


61 


AB031069 


Homo sapiens 


protein containing cxxc 
domain 1 


1390 


100 


62 


Y66660


Homo 
sapiens 


Membrane-bound protein
PRO783.


2492 


99 


63 


Y66660 


Homo 
sapiens 


Membrane-bound protein
PRO783.


1709 


99 


64 


S70011 


Rattus sp. 


tricarboxylate carrier 


895 


55 


65 


AF139518 


Rattus 
norvegicus 


A-kinase anchor protein


178 


24 


66 


W29666 


Homo sapiens 


Homo sapiens DH1308_1 clone 
secreted protein. 


157 


30 


67 


AJ245738 


Homo sapiens 


claudin-15 


1206 


100 


68


AF099138 


Rattus 
norvegicus 


GLUT4 vesicle protein 


4183 


87 


69 


AF099138 


Rattus 
norvegicus 


GLUT4 vesicle protein 


4906 


86 


70 


Z82059


Caenorhabditis
elegans


Similarity to Drosophila ring 
canal protein comes from 
this gene 


1285 


44 


71 


AF224278 


Homo sapiens 


PMEPA1 protein


1282 


100 


72 


AF126426 


Homo sapiens 


neurotrimin


1809 


100 


73 


Y41652 


Homo 
sapiens 


Human MEK2 protein sequence. 


2065 


99 


74 


Y41652 


Homo 
sapiens 


Human MEK2 protein sequence. 


1207 


100 


75 


AF188622 


Mus musculus 


selectively expressed in 
embryonic epithelia protein-1


1485 


74 


76


AE000406 


Escherichia 
coli 


putative DNA topoisomerase


950 


100 


77 


X99302 


Homo sapiens 


Pop1


655 


100 


78 


AL136538 


Schizosaccharomyces
pombe


similarity to S. cerevisiae
kti12 protein


210 


31 


79 


AF129756 


Homo sapiens 


G4 


1554 


99



80 



81



82



AL096768



Homo sapiens 



AL096768 



dJ858B16.2
(phosphatidylserine
decarboxylase (PSSC, EC
4.1.1.65))



2033 



Homo sapiens 



dJ858B16.2

(phosphatidylserine
decarboxylase (PSSC, EC
4.1.1.65))



100 



96 



83 
84
85



X57351



AC005594 
X73113 



Homo sapiens 



1-8D 



Homo sapiens 



R26984 1 



677 
2700 



98 



98 



AF097330 



Homo sapiens
Homo sapiens



fast MyBP-C 

H1 chloride channel; p64H1;



5959



99



86



87 



88 



89 



90 



~9T" 



93 
94



CLIC4 



1305 



AB018423 



Mus musculus



AF272151
AF196329 



Homo sapiens 



SH2 domain-containing protein



1360 



adaptor protein CIKS 



AB016879 



Homo 

sapiens 

Arabidopsis 



3084 



triggering receptor expressed
on monocytes 1 



1214 



thaliana 



AJ133721 



AJ242864 



A61971 



Y99365 



Mus musculus 



Mus musculus



unidentified 



Homo sapiens 



contains similarity to pre-mRNA
splicing factor-gene_id: MRB17.2



634 



homeodomain protein



phtf protein 



MCSP 

Human PRO1250 (UNQ633) amino
acid sequence SEQ ID NO:85.



TS4" 



619 



11676 



3890



99 



78



99 



100 



36



57



61 



99 



100 



95 



Y87231 



Homo sapiens 



AF227741 



Rattus 
norvegicus 



Human signal peptide
containing protein HSPP-8

SEQ ID NO: 8.

protein kinase WNK1



1031 



2428 



100 



95 



97 



99 



100 



AF227741 



Y92513
AL021366



Rattus 
norvegicus 



protein kinase WNK1



1961 



Homo sapiens 
Homo sapiens 



Human OXRE-10. 



AC005733 



Y95293 
AL118501



Homo sapiens 



CICK0721Q.3 (Kinesin related 
protein) 



1626 



3423 



R33083 1 



Homo sapiens 
Homo sapiens 



Human GEF containing NEK-like
kinase substrate sGNK. 



1974



4092 



94 



100 
100 



99 



101 



102 



103 



104 



105 
106 



AJ006267 



AF100753 



Homo sapiens 



dJ1191N16.1 (a novel protein
(translation of the cDNA
DKFZp566A0946, Em:AL050069))



1509 



Homo sapiens 



ClpX-like protein



AB015982 



AF151074 



Homo sapiens 



ancient ubiquitous 46 kDa 
protein AUP1 



3233 



2042 



serine/threonine kinase



Homo sapiens 



4718 



HSPC240 

GTP-binding protein (rab7)



831 



100 



100 



96 

loir 



64 



"ToT" 



108 



M35522 



R99800 



Canis 
familiaris 



354 



Homo sapiens 



NTII-1 nerve protein, 



AF125533



facilitates 
nerve cells. 



2337 



regeneration of 



Homo sapiens 



NADH-cytochrome b5 reductase
isoform 



1290 



50 



93 



"Tio- 



iii 



112 



AF064729 



Homo sapiens 



F23269 2 



X52425 



Homo sapiens 



Y41686 



Homo sapiens 



RAN binding protein 16 



Homo 
sapiens 



interleukin 4 receptor



3369 
3285 



Human PRO274 protein
sequence. 



1496 



2285 



99 



100 



100 



100 



Homo 



sapiens 



Mitogen activating protein 
kinase ERK1.



1991 



100 



Homo sapiens 



Human membrane transport 
protein, MTRP-16. 



1190 



99 



116 

117



Homo sapiens 



AF189817 
W30891



dJ398G3.1 (ortholog of rat
CPG2) 



3497 



Mus musculus 



evectin-2
Human cytostatin lit proTeirT: 



1124 



99 



90 



Homo 



715 



99 
118


AF116618 


Homo sapiens


PRO1038


1469 


100 


119 


Y08915 




alpha 4 protein 


1748 


100 


120


AF098070 


Drosophila
melanogaster


Lis1 homolog


192 


39 


121 


AF052432


Homo sapiens 


katanin p80 subunit 


181 


37 




VT n*7/l "3 


Homo sapiens 


PSEQ-1 protein encoded by 
NSEQ gene associated with 
matrix remodelling. 


2637 


98 


123 


AF083246 


Homo sapiens 


HSPC028 


2132 


100 


124




Homo sapiens 


Human viral receptor protein 

(ACVRP).


833 


99 


125 


M63109 


Leishmania 
major 


glycoprotein 96-92 


172 


27 




u o / 


melanogaster


Atu 


935 


36 


127 


Z68220 


Caenorhabditis
elegans


Similarity to Human ADP/ATP 
carrier protein 


438 


43 


128 


AF095927 


Rattus 
norvegicus 


protein phosphatase 2C 


1927 


94 




W929oo 


Homo sapiens 


Human zsig4 4 protein. 


463 


100 


130 


AF115391


Lactobacillus
sakei


ribokinase RbsK


508 


37 


131 


X93498 


Homo sapiens 


21-Glutamic Acid-Rich Protein 


1250 


100 


132 


X93498 


Homo sapiens 


21-Glutamic Acid-Rich Protein


9l£ 


87 


133 


W52811 


Homo sapiens 


Human DBI/ACBP-like protein
(DBIH).


705 


97 


134 


Y84444 


Homo sapiens 


Amino acid sequence of a
human RNA-associated
protein. 


3230 


100 


135 


M69181 


Homo sapiens 


non-muscle myosin B


189 


20 


136 


W74882


Homo sapiens 


Human secreted protein 
encoded by gene 154 clone 
HE6FL83.


480 


100 


137 


W78200


Homo sapiens 


Human secreted protein 
encoded by gene 75 clone 
HHGAU81. 


855 


99 


138


wT ni^con 

/iuU J J3i U 


Homo sapiens 


dJ349A12.1 (similar to
KIAA0701 protein) 


424 


39 


139 


AF020261 


Santalum 
album 


proline rich protein 


119 


30 


140 


X70394 


Homo sapiens 


zinc finger protein 


1634 


100 


141 


Y06439 


Homo sapiens


Human protease HUPM-8. 


936 


100 


142 


Z68493 


Caenorhabditis
elegans


predicted using Genefinder


365 


42 


143


AB018107 


Arabidopsis
thaliana


ADP-ribosylation factor-like
protein 


596 


65 






Homo sapiens 


HSPC134 


580 


51 


145 


Y84902 


Homo sapiens 


A human proliferation and
apoptosis related protein. 


480 


100 


146 


AB004906 


Ipomoea 
purpurea 


transposase 


146 


20 


147 


AC007357 


Arabidopsis
thaliana


F3F19.18 


647 


31 


148


W75155 


Homo sapiens 


Human secreted protein 
encoded by gene 41 clone 
HNTME13.


1494 


98 


149 


AF056490 


Homo sapiens 


cAMP-specific
phosphodiesterase 8A 


3710


99 


150 


Y58171 


Homo 
sapiens 


Human hydrolase homologue 
HHH-7. 


785


99 


151 


U10397 


Saccharomyce 
s cerevisiae 


Yhr146wp


515 


53 


152 


X73478 


Homo sapiens 


phosphotyrosyl phosphatase 
activator 


1719 


99 


153 


AL049697


Homo sapiens 


dJ382I10.5.l (novel protein 


2034


99 











similar to arginyl-tRNA)






154


AF169802 


Homo sapiens 


cytochrome b5 reductase b5R.2 


1455 


99 


155 


X94703 


Homo sapiens 


rab28 


1126 


99 


156


Y25716 


Homo sapiens 


Human secreted protein 
encoded from gene 6.


1471 


100 


158 


W77404 


Homo sapiens 


Secreted salivary polypeptide 
zsig32.


937 


100 


159 


Y17248 


Homo sapiens 


Human protein kinase 
inhibitor-2 (PKI-2).


383 


100 


160 


J04970 


Homo sapiens 


carboxypeptidase M precursor 


2395


100 


161 


W54040 


Homo sapiens 


Human interferon-inducible 
protein, HIFI. 


484 


98 


162 


AL022724 


Homo sapiens 


dJ413H6.1.1 (hamster
Androgen-dependent Expressed
Protein LIKE putative 
protein) (isoform 1) 


1357 


100 


163 


AF125535 


Homo sapiens 


pp21 homolog


193 


45


164 


G03632 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7713. 


463 


97 


165 


AJ250839


Homo sapiens 


serine/threonine protein
kinase 


1442 


71 


166 


L09649 


Zymomonas 
mobilis 


zm2 


173 


37 


167 


Y73337 


Homo sapiens 


HTRM clone 1944530 protein 
sequence. 


1204 


100 


168 


W83645 


Homo sapiens 


Secreted protein encoded by 
gene 112 clone HUKFC71. 


1084 


100 


169 


AF214731 


Homo sapiens 


ATP-dependent RNA helicase 


4402 


100 


170 


AE000871 


Methanobacte 
rium 

thermoautotr 
ophicum 


conserved protein 


166 


27 


171 


Y27684 


Homo sapiens 


Human secreted protein 
encoded by gene No. 118. 


821 


100 


172 


AF226044 


Homo sapiens 


HSNFRK 


2904 


100 


173 


AJ245946 


Homo sapiens 


neuroglobin 


779 


100 


174 


D43949 


Homo sapiens 


This gene is novel.


3202 


100 


175 


Y07923 


Homo sapiens 


GTP-binding protein 


1205 


100 


176 


W90335


Homo 
sapiens 


Human DPI homologue protein. 


966 


100 


177 


Y41675 


Homo sapiens 


Human channel-related
molecule HCRM-3.


1122 


100 


178 


Y41674 


Homo sapiens 


Human channel-related
molecule HCRM-2.


936 


99 


179 


AF220492 


Homo sapiens 


krueppel-like zinc finger 
protein HZF2 


4100 


99 


180 


X03084 


Homo sapiens 


C1q B-chain precursor


1240 


100 


181 


U57344 


Mus musculus 


Meis3 


1813 


89 


183 


U57344 


Mus musculus


Meis3 


1743 


86 


184 


U57344 


Mus musculus 


Meis3 


1070 


86 


185 


AF033120


Homo sapiens 


p53 regulated PA26-T2 nuclear 
protein 


1389 


58 


186 


AF200357 


Mus musculus 


pantothenate kinase 1 beta 


1605


82 


187 


W75058 


Homo sapiens 


Human secreted protein 
encoded by gene 2 clone 
HLDBG33.


1188 


99 


188


>iU jS J? 2. J 2. y 


Homo sapiens 


suppressor of sterile four 1


2424 


100 


190 


X54134 


Homo sapiens 


protein-tyrosine phosphatase


3705 


100 


191


Y22203 


Homo sapiens 


Human calcium-binding
phosphoprotein, CBPP-1,
protein sequence. 


1083 


99 


192 


W63692 


Homo 
sapiens 


Human secreted protein 12.


1975 


100 


193 


W87772 


Homo sapiens 


Human serum glucocorticoid- 
regulated kinase (H-SGK2) 
polypeptide.


2605 


99 


194 


AF084259 


Mus musculus


bromodomain-containing
protein BP75 


693 


54 


195 


Y00752 


Rattus
norvegicus 


serine dehydratase (AA 1-
327) 


994 


61 


196 


W95349 


Homo sapiens 


Human foetal brain secreted 
protein fhl70_7. 


2596 


100 


197 


AB028859 


Homo sapiens 


hDj9 


1890 


100 


198


W95633 


Homo sapiens 


Homo sapiens secreted protein 
gene clone hm236_l. 


1614 


100 


199 


Y44277 


Homo 
sapiens 


Human nucleic acid methylase- 
2. 


2096 


99 


200 


AB030039 


Homo sapiens 


hPACPLl 


2258 


100 


201 


X54162 


Homo sapiens 


54 Kd autoantigen 


2918 


99 


202 


G02061 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6142.


558 


99 


203 


X13885


Nicotiana 
tabacum


extensin (AA 1-620) 


185 


33 


204 


J04204 


Bos taurus 


32 kd accessory protein 


1837 


100 


205 


J04204 


Bos taurus 


32 kd accessory protein 


1101 


100 


207 


Y87283 


Homo sapiens 


Human signal peptide 
containing protein HSPP-60 
SEQ ID NO: 60. 


1318 


100 


208 


Y02860 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 65. 


936 


98 


209 


AL121889 


Homo sapiens 


dJ1076E17.1 (KIAA0823 protein 
(continues in AL023803))


694 


54 


210 


AF226732 


Homo sapiens 


NPD007


1345 


76 


211 


X66295 


Mus musculus 


C1q C chain


970 


73 


212 


Z29328 


Homo sapiens 


Ubiquitin-conjugating enzyme
UbcH2 


966 


100


213 


Z29328 


Homo sapiens 


Ubiquitin-conjugating enzyme
UbcH2 


542 


98 


214 


AJ002030 


Homo sapiens 


progesterone binding protein


1163 


100 


215 


X70649 


Homo sapiens 


member of DEAD box protein 
family 


3933 


100 


216 


AF250558 


Homo sapiens 


claudin-2 


1169 


99 


217 


AL021453 


Homo sapiens 


dJ821D11.1 (PUTATIVE protein)


259 


100 


218 


Y08565 


Homo sapiens 


UDP-GalNAc:polypeptide N-
acetylgalactosaminyltransferase


3331 


99 


219 


Y94452 


Homo sapiens 


Human inflammation associated 
protein 


2067 


100 


220 


AL035521 


Arabidopsis 
thaliana


putative protein 


315 


42 


221 


AL031786 


Schizosaccharomyces
pombe


putative proline-tRNA
synthetase 


Bll 


41 


222 


AL109736 


Schizosaccharomyces
pombe


WD repeat protein 


626 


40 


223 


X52493 


Glycine max 


DNA-directed RNA polymerase 


136 


23 


224




Homo sapiens 


CUT979N1.1 (dJ97 9Nl.l) 


5199 


98 


225 


/U3 U .5 si <l U ± 


Mus musculus 


mmDj4 


1761 


92 


226 


AB032401 


Mus musculus 


mmDj4 


1988 


92 




X83502


Saccharomyce 
s cerevisiae 


J1007 


112 


26 


228 


X83502 


Saccharomyces
cerevisiae


J1007


"to ' 


25 


229 


AF143723 


Homo sapiens 


heat shock protein HSP60 


2557


99 


230


Y66677 


Homo 
sapiens 


Membrane-bound protein
PRO828.


982 


100 


231


AB027466 


Homo sapiens 


spondin 2 


1756 


99 


232 


W95634 


Homo 
sapiens 


Homo sapiens secreted 
protein. 


1391 


100 


233 


K00365 


Homo sapiens 


Human cyclin B1.


2218 


99 


234 


¥537^2 


Homo sapiens 


A GTP-binding polypeptide


1017 


100








designated RAQ. 






235


Z50749 


Homo sapiens 


yeast sds22 homolog 


1800 


100 


236 


Z50749 


Homo sapiens 


yeast sds22 homolog 


1754 


98 


237


apn Tea qi 


Homo sapiens 


PICK1 


2137 


100 


238




Entodinium
caudatum


putative
phosphatidylinositol-4-
phosphate 5-kinase


114 


37 


239


AB030189 


Mus musculus 


contains transmembrane (TM) 
region and ATP binding region 


710 


93 


240 


N9bOJ o 


Homo sapiens 


Human hedgehog interacting 
protein (HIP) . 


3785 


99 






Homo sapiens 


Human hedgehog interacting 
protein (HIP) . 


3436


99 


242 


AF155107 


Homo sapiens 


NY-REN-37 antigen


996 


99 


243 


AF155107 


Homo sapiens


NY-REN-37 antigen


1005 


100 


244


AL031320


Homo sapiens 


dJ20N2.1 (novel protein 


763 


99 








similar to yeast and 
bacterial cytosine 
deaminase) 




245


U37026


Rattus 
norvegicus 


sodium channel beta 2 subunit 


162 


30 


246 


AL078599


Homo sapiens 


dJ991C6.1 (novel protein 
similar to C. elegans 
F55A12.9 (Tr:P91086)) 


2391 


98 


247 


U32274 


Saccharomyces
cerevisiae


Ydr386wp; CAI: 0.12 


191 


37 


248


Y41719 


Homo 
sapiens 


Human PRO864 protein
sequence.


1079 


100 


249


AB029434 


Homo sapiens 


ghrelin precursor 


611 


100 


250 


X97831 


Rattus 
norvegicus 


carnitine/acylcarnitine 
carrier protein 


24£ 


38 


251 

? ero 


W80993 


Homo 
sapiens 


Human RIP-interacting factor
RIF. 


1724 


100 




Y94873 


Homo 
sapiens 


Human protein clone HP02632. 


1876 


100 




5759878 


Homo sapiens 


Amino acid sequence of the 
cDNA clone AIF-2 (HEBGM49).


765 


100


254


AL354533 


Leishmania 
major


possible adenylate kinase 


265 


34 


255 


AF233322 


Mus musculus 


zinc transporter like 2 


1916 


95 


256 


Y78113 


Homo sapiens


Human cytokine signal
regulator CXSR-1 SEQ ID
NO:1.


2247 


99 


257 


AL035539 


Arabidopsis
thaliana


putative amino acid transport 
protein 


390 


27 




W74787 


Homo sapiens 


Human secreted protein 
encoded by gene 58 clone 
HHFHN61.


1171 


100 


259 


AL035689 


Homo sapiens


CU187J11.1 (novel protein
similar to protein kinase C
inhibitors)


974 


100 


260 


AE00O9OQ 


Methanobacte
rium

thermoautotr


serine/threonine protein
kinase related protein


363 


30 






ophicum 








261 
262 


AL050131 
AF019661 


Homo sapiens
Mus musculus


hypothetical protein
zeta proteasome chain; PSMA5 


626 
1214 


100 
100 


263 
264

265
266


AL035593 
AL022318 

AF205940 


Homo sapiens 
Homo sapiens 

Homo sapiens 


cufciOJS.i (novel protein) 
bK150C2.3 (PUTATIVE novel 
protein similar to APOBEC1)
endomucin 


821 
1072 

1289 


100 
100 

100 


267 


ALQ235Q3 
AL034548 


Homo sapiens 
Homo sapiens 



dJ50OL14.1 (novel protein) " " 
dJH03G7.3 (novel protein 
kinase domains containing 
protein similar to 
phosphoprotein C8FW) 


789 
1888 


100
99 



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I SEQ 

i 10 

I NO: 
268


ACCESSION 
NUMBER 

AF161470 


SPECIES 
Homo sapiens 


DESCRIPTION 

HSPC121 


SMITH- 
WATERMAN 
SCORE 
1884 


IDENTITY 
98 


269
270

271


AF161470 
X90763 

AF207600


Homo sapiens
Homo
sapiens
Homo sapiens


HHa5 hair keratin type I 
ethanolamine kinase 


1232 
2190 


96 
99 


272 
273


M32334 
AF161483 


Homo sapiens 


intercellular adhesion
molecule 2

HSPC134 


1952 
1436


100 
100 


274


Y53052 




Human secreted protein clone 
df202_3 protein sequence SEQ
ID NO:110.


663 
587 


61 
100 


276 


Y77S76 


Homo sapiens


Human cytoskeletal protein
(HCYT) (clone 2195418).


762 


100 


277 


AF077042 




30S ribosomal protein S7
homolog


1269 


100 


278 


Y94907


Homo sapiens


Human secreted protein clone 
cal06jL9x protein sequence 


1619 


98 


279 


Y68788 


Homo sapiens 


Amino acid sequence of a 
human phosphorylation
effector PHSP-20. 


2801 


99 


280 


Z75134 


Canis 

familiaris


rod transducin 


1816 


100 


281 

282 
283


Z75134

AF249873
AL050007


Canis 

familiaris
Homo sapiens
Homo sapiens


rod transducin

muscle-specific protein
hypothetical protein 


1718 

1395 
405 


96 
100 

98


284 
285 

287
288 


AF201931 
AF156102 
Y35897

U88964
AL050143


Homo sapiens
Homo sapiens
Homo sapiens

Homo sapiens
Homo sapiens


DC1

ELL complex EAP30 subunit
Extended human secreted
protein sequence, SEQ ID NO.
146.
HEM45

hypothetical protein 


1859 
1318 

~i2so 

923 


99 
99 
99 

100 


289
290

291


AJ0110S8 
Y66724 

AF034801 


Homo sapiens 
Homo 
sapiens 
Homo sapiens 


telethonin 

Membrane-bound protein 
liprin-alpha4 


598 

574
2321 

2565 


100 
100 
100 

98 


292 
293


AF034001
AL049851


Homo sapiens
Homo sapiens


liprin-alpha4 

diJ889J22B.l {novel protein 

(isoform 1) ) 


2590 
1738 


100 
100 


294 
295 


Y73348 
L11672 


Homo sapiens 
Homo sapiens 


sequence. 


1245 


99 


296 


AL035423


Homo sapiens


dJ20l3.1 (brain mitochondrial
carrier protein-1 (BMCP1))


1694 
1024 


44 

79 


298


AF198532 
AF161417 


Homo sapiens 
Homo sapiens 


lymphoid enhancer binding
factor-1

HSPC299


2173 


100 


299 
300 


AF159141 


Homo sapiens 


breast cancer metastasis
suppressor 1


1147 

±224 


85 
99 


301


U26397 


Rattus 
norvegicus 


inositol polyphosphate 4- 
phosphatase 


160 


30 


302


AF036145 
Z82022 


Homo sapiens 
Homo sapiens 


meningioma-expressed antigen
GlcNAc-1-P transferase


3458 


100 


303 
304


AF269232


Mus musculus


butyrophilin-like protein
BUTR-1 


2067 
271 


99 
50 




AJ222644


Arabidopsis
thaliana


asparaginyl-tRNA synthetase


659 


50 


305

30$ i 


!VP054180 1 
^272079 1 


Homo
sapiens

Homo sapiens


hematopoietic cell derived
zinc finger protein
APOBEC-1 stimulating protein


351 


79 


308

309


£44486 ] 

s 

U131891 [*1 


Homo

sapiens

Homo sapiens


Human GPRW receptor
polypeptide.

DNA polymerase mu


3056 
1721

>598 ~~: 


100
100

100


310 


AF293335 


Homo sapiens 


p30 D8C 


1248 


92 


311 


AF176525 


Mus musculus 


F-box protein FBL12 


1501 


93 


312 


X57802 


Homo sapiens 


immunoglobulin lambda light 
chain 


959


81 


313 


Z36715


Homo sapiens 


Net 


2048 


98 


314 


AF161532 


Homo sapiens 


HSPC047 


727 


100 


315 


AF208068 


Homo sapiens 


kelch-like protein KLHL3a


3046 


100 


316 


Y^6*6 


Homo 
sapiens 


Membrane-bound protein
PRO1013. 


1166 


100 


317 


Y29666 


Homo sapiens 


Human Ras protein RApR-i. 


1253 


98 


318 


AJ387747 


Homo sapiens 


sialin 


2614 


99 


319 


AF161362 


Homo sapiens 


HSPC099 


224 


40 


320 


Y63773 


Homo sapiens 


Amino acid sequence of a 
human phosphorylation 
effector PHSP-5. 


2243 


99


321 


AJ238379 


Homo sapiens 


putative TH1 protein


3013 


100 


322


AB040812


Homo sapiens 


protein kinase PAK5 


3792 


99 


323 


Y95013 


Homo sapiens 


Human secreted protein
vc48_1, SEQ ID NO:66.


913 


100 


324 


Y13381 


Homo sapiens 


Amino acid sequence of 
protein PR0271. 


1976 


100 


325 


Y94944


Homo sapiens 


Human secreted protein clone 
bf157_16 protein sequence
SEQ ID NO: 94. 


2305 


98 


326 


Y76884 


Homo sapiens 


Retinoblastoma binding 
protein-7 sequence.


6728 


99 


327 


AF198532 


Homo sapiens 


lymphoid enhancer binding 
factor-1 


2173 


100 


328 


Z78013 


Caenorhabditis
elegans


Similarity to Drosophila
Cadherin-related tumor
suppressor 


569 


33 


329 


AF212921 


Ml 1 G mi i c r^i line 


\ r mx v receptor variant x. 


484


94 


330 


Z75330 


Homo
sapiens


nuclear protein SA-1


6492 


99 


331 


AL006583 


Homo sapiens 


dJ327J16.3 (supported by
GENSCAN, FGENES and GENEWISE)


2133 


99 


332 


Y36104 


Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO. 
489. 


310 


41 


333 


AJ271669 


Homo sapiens 


putative sialoglycoprotease 


1747 


100 


334 


AF156598 


Mus musculus


p53-regulated DDA3


997 


64 


335


M99058 


Eimeria 
maxima 


em100 gene is homologous the
Eimeria tenella gene et100


154


26 


336 


Y85564 


Homo sapiens 


Human homologue of UNC-53
(Hs-UNC-53/1) sequence.


3386 


97 


337 


Y85564


Homo sapiens


Human homologue of UNC-53
(Hs-UNC-53/1) sequence.


2602 


94 


338 


Y85564 


Homo sapiens 


Human homologue of UNC-53 
(Hs-UNC-53/1) sequence. 


3447 


98 


339 


~Z565<il 


Caenorhabditis
elegans


Similarity to Human rab13
protein (PIR Acc. No.
A49647).


716 


34 


340 


AB021643 


Homo 
sapiens 


gonadotropin inducible 
transcription repressor-3 


2761 


99 


341 


G01946 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6027. 


465 


98 


342 


AF020591 


Homo sapiens 


zinc finger protein 


1091 


48 


343 


L29154 


Homo sapiens 


immunoglobulin heavy chain 


439


84 
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TABLE 2 



PCT/US00/34263 













VDJ region 






344 


U10281 


Sus scrofa


gastric mucin 


279 


24 


345 


AK000404


Homo sapiens 


unnamed protein product 


1177 


99 


346 


L22557 


Rattus 
norvegicus 


calmodulin-binding protein


1949 


84 


347 



L22557 


Rattus 
norvegicus 


calmodulin-binding protein


2363 


91 




AL049481


Arabidopsis
thaliana


AIG1-like protein


316


30 


350 


AJ251516 


Mus musculus


cysteine and histidine-rich 
protein 


1460 


39 


351 


AK024477 


Homo sapiens 


FLJ00070 protein


1773 


100 


352 


U50133 


Homo sapiens 


ankyrin 


502 


33 


353 


AK000625 


Homo sapiens 


unnamed protein product 


721 


100 


354 


AF161420 


Homo sapiens 


HSPC302


2623 


97 


355


AJ010014 


Homo sapiens 


M96A protein 


1269 


47 


356


AF151029 


Homo sapiens 


HSPC195


941 


91 


357 


AL022327 


Homo sapiens 


dJ355C19.1 (KIAA0027) 


1911 


100


358


W78128


Homo sapiens 


Human secreted protein 
encoded by gene 3 clone 
HOSBI96. 


1117 


100 


359 


X03414 


Drosophila 
melanogaster


Kr polypeptide 


316 


45 


360 


AF151079 


Homo sapiens 


HSPC245 


643 


100 


361 


Y53886 


Homo sapiens 


A suppressor of cytokine 
signalling protein 
designated HSCOP-6. 


530 


41 


362 


AF254741 


Drosophila 
melanogaster 


Centaurin Gamma 1A


681 


46 


363 


AF213465 


Homo sapiens 


dual oxidase 


2016 


100 


364 


AF181562 


Homo sapiens 


proSAAS 


1319 


100 


365 


AF181562 


Homo sapiens 


proSAAS 


1624 


99 


366 


U73200 


Mus musculus


p116Rip


864


62 


367 


AF263744 


Homo sapiens 


erbb2-interacting protein
ERBIN 


4973 


99 


368 


U37501 


Mus musculus 


laminin alpha 5 chain 


5867


72 


369 


AF043695 


Caenorhabditis
elegans


similar to the protein
phosphatase 2c family


549 


36 


370


Y73440 


Homo sapiens 


Human secreted protein clone 
yj23_l protein sequence SEQ 
ID NO: 102. 


1484 


99 


371 


AF272833 


Homo sapiens 


misato 


2869 


97 


372 


AF198454 


Homo sapiens 


epithelial protein lost in 
neoplasm beta 


3927 


100 


373


Y73345 


Homo sapiens 


HTRM clone 436283 protein 
sequence.


273 


80 


374 


AF169017 


Homo sapiens 


formiminotransferase
cyclodeaminase


2717


98




375 


A95106 


unidentified 


RED ALPHA 


1202 


99 


376


W74828


Homo sapiens


Human secreted protein 
encoded by gene 100 clone 
HLQA352.


1012 


99 


377 


Y32131 


Homo sapiens 


Human LYST-2 protein. 


3556


99 


378 



M14912 


Homo sapiens 


pol 


132 


86 


379


AF090934 


Homo sapiens 


PRO0518


382


100 


380 


X66363 


Homo sapiens 


serine/threonine protein
kinase 


2499 


100 


381 


Y41699 


Homo 
sapiens 


Human PRO703 protein 
sequence.


2362 


100 


382 


AF174498 


Homo sapiens 


GR AF-i specific protein 
phosphatase 


7008 


98 


383 


U64608 


Caenorhabditis
elegans


coded for by C. elegans cDNA 
ykl73cl2.5 


244 


36


384
385


U50133 


Homo sapiens 


ankyrin 


502 


33 




AJ238520 


Homo sapiens


putative transcription 
factor-like nuclear regulator


4123 


97
TABLE 2



SEQ 
ID 

NO: 


ACCESSION
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


IDENTITY 


387 


AF208845 


Homo sapiens 


BM-003 


1375 


99 


389 


X57821 


Homo sapiens 


immunoglobulin lambda light 
chain 


797 


76 


390 


AF182404 


Homo sapiens 


mitochondrial uncoupling 
protein 1 


1670 


99 


391 


Y85564


Homo sapiens 


Human homologue of UNC-53
(Hs-UNC-53/1) sequence. 


3386 


97 


393 


AF178432 


Homo sapiens 


SH3 protein 


3700 


100 


394 


AF229928 


Drosophila 
melanogaster 


cytoplasmic protein 89BC 


161<? 


62 


395 


AF181721 


Homo sapiens 


RU2S 


2254 


100 


396 


Y69197 


Homo sapiens 


Amino acid sequence of a 
human beta IV-spectrin
protein. 


1626 


98 


397 


U48238


Mus musculus 


zinc finger protein neuro-d4 


749 


60 


398 


AL390137 


Homo sapiens 


hypothetical protein 


263 


51 


399 


AF217525 


Homo sapiens 


Down syndrome cell adhesion 
molecule 


5337 


60 


400 


AL022599 


Schizosaccharomyces
pombe


WD repeat protein 


447 


27 


401 


AC004859


Homo sapiens 


similar to 2-oxoglutarate 
dehydrogenase; similar to
Q02218 (PID:g1352618)


4176 


78 


402 


AB010266 


Mus musculus 


tenascin-X 


10246


62 


403 


AL133288 


Homo sapiens 


dJ671D7.1 (similar to
D. melanogaster CG5986 
protein) 


761 


100


404 


Z68753 


Caenorhabditis
elegans


ZC518.3b 


888 




405 


Z78013 


Caenorhabditis
elegans


Similarity to Drosophila 
Cadherin-related tumor
suppressor 


569 


33 


406 


AB031230 


Homo sapiens 


protein containing CXXC 
domain 2 


1196 


97 


407 


AF155106 


Homo sapiens 


NY-REN-36 antigen


1168 


100


408 


Y57945


Homo sapiens 


Human transmembrane protein 
HTMPN-69. 


1538 


99 


409 


Z18361 


Ovis aries 


trichohyalin 


184 


30


410 


AF249744 


Homo sapiens 


RhoGEF 


2733 


100 


411 


AF176529 


Mus musculus 


F-box protein FBX13 


2072 


94 


412 


AF210842 


Homo sapiens 


HARP 


4880 


100 


413 


AL031^5& 


Homo sapiens 


dJ310O13.7 (novel protein 
similar to H. roretzi HRPET- 
3) 


776 


if a 


414 


X57398 


Homo sapiens 


pm5 protein 


6131 


99 


415 


AB029826 


Homo sapiens 


3-methylcrotonyl-CoA
carboxylase biotin-containing
subunit 


2961 




416 


U43503 


Saccharomyces
cerevisiae


Lph1p


115 


42 


417 


AL160493 


Leishmania 
major 


possible t26fl7.21 


239 


35


418 


Y08100 


Homo sapiens 


Human PRO331 protein.


330 


29 


419 


U15131 


Homo sapiens 


p126


2228 


54 


420 


AF117946 


Homo sapiens 


Link guanine nucleotide
exchange factor II 


2363 


100 


421 


AF190635 


Drosophila 
melanogaster 


ankyrin 2


755 


30 


422 


AF302150 


Homo 
sapiens 


phosphoinositol 3-phosphate-
binding protein-2


1962 


100 


423 


AL137530


Homo sapiens 


hypothetical protein 


433


94


424 


X63753


Homo sapiens 


son-a 


7269 


100 


425 


AB027249 


Homo sapiens 


MAPKK like protein kinase 


1693 


100 


426 


AF279144


Homo sapiens 


tumor endothelial marker 7 
precursor


1084 


55


427 


AF279144


Homo sapiens 


tumor endothelial marker 7 
precursor 


1259 


56


428


AE003683 


Drosophila 
melanogaster 


CG8312 gene product 


149 


29


429 


Y07829 


Homo sapiens 


RING finger protein 


2201 


99 


430 


AF096897 


Drosophila 
melanogaster 


pushover 


4442 


47 


431 


U41387 


Homo sapiens 


Gu protein 


4021 


99 


432 


AF023674 


Homo sapiens 


nephrocystin 


3783


100 


433 


AF146760 


Homo 
sapiens 


septin 2-like cell division
control protein 


2284 


100 


434 


AB006697 


Arabidopsis
thaliana 


cleft lip and palate 
associated transmembrane 
protein-like


886 


42 


437 


Y94247 


Homo sapiens 


Human calcium binding protein 
hCBP.


1704 


100 


438


AB040672 


Homo sapiens 


UDP-GalNAc:polypeptide N-
acetylgalactosaminyltransferase


1075


63 


439 


AF105228 


Bos taurus 


tuftelin 


285 


33 


440 


R06463 


Homo sapiens 


Derived protein of clone 
ICA13 (ATCC 40553).


3073 


99 


441 


X14971 


Mus musculus


alpha-adaptin (A) (AA 1-977)


4897 


98 


442 


X53773 


Rattus 
norvegicus 


alpha-c large chain (AA 1-
938)


3979 


81 


443 


Y66689 


Homo 
sapiens 


Membrane-bound protein
PRO1136.


3299 


99 


444 


AC067754 


Arabidopsis 
thaliana 


unknown protein; 20348-23707 


114 


33 


445 


AF229032 


Mus musculus


piL 


2077


93 


446 


AF056035 


Rattus 
norvegicus 


s-nexilin


2662 


85 


447 


AF132484 


Mus musculus


unknown 


478


51 


448 


W69024 


Homo sapiens 


Polypeptide fragment encoded 
by gene 156.


528 


45 


449


AF161445 


Homo sapiens 


HSPC327 


1606 


100 


450 
451


Z68753 


Caenorhabdit 
is elegans 


ZC518.3b 


951 


49 


W39160 


Homo sapiens 


Human partial complement 
factor H protein fragment 3. 


155 


32 


452 


W85727 


Homo 
sapiens 


Novel protein (clone 
BM46_10).


2?99 


99 


453 


Y53629 


Homo sapiens 


A bone marrow secreted 
protein designated BMS115. 


2810 


100 


454


D87438 


Homo 
sapiens 


Similar to a C. elegans 
protein in cosmid C14H10 


4069 


100 


455


AF240468 


Homo sapiens 


nicastrin


3687 


100 


456 


Z15005


Homo sapiens 


CENP-E 


13305 


99 


457 


M59216


Homo
sapiens 


gamma-aminobutyric acid
receptor beta-1 subunit


2477 


100 


458 


Y73467 


Homo sapiens 


Human secreted protein clone 
yd6l_i protein sequence SEQ 
ID NO: 156. 


966 


100 


459 


W67824 


Homo sapiens 


Human secreted protein 
encoded by gene 18 clone
HSLFM29. 


535 


100 


460 


AF163151 




dentin sialophosphoprotein 
precursor 


279 


19 


461 


D87446


Homo sapiens 


Similar to a C. elegans 
protein encoded in cosmid 
C27F2 (U40419) 


9196 


99 


462


G04044


Homo sapiens


Human secreted protein, SEQ
ID NO: 8125.


486


93


463


AC002398


Homo sapiens


F25965_1


1018


100


464


AF064856


Rattus sp.


7acomp protein


1645


84


465


AF223408


Homo sapiens


B99


3686


99


466 


AF223408


Homo sapiens 


B99


2878 


87


467


AF104415


Mus musculus


gene trap locus-13


6336 


91


468 


U53450


Rattus 
norvegicus 


jun dimerization protein 1
JDP-1


196 


49 


469 


AL031297


Homo sapiens


dJ97P20.1 (novel gene)


3564


99


470


AF257077


Homo sapiens


eukaryotic translation
initiation factor EIF2B
subunit 3


1274 


95 


471 


L28125 


Podospora 
anserina 


beta transducin-like protein


284


38


472 


Y84903 


Homo sapiens 


A human proliferation and 
apoptosis related protein. 


2337 


100 


473 


AF144237


Homo sapiens 


LOMP protein 


252 


44 


474 


Y71213 


Homo sapiens 


Human irritable bowel disease 
related polypeptide IMX39.


838 


100


475 


Y95006 


Homo sapiens 


Human secreted protein 
vel3_l, SEQ ID NO:52. 


3411 


100


476 


D38549


Homo sapiens 


hal025 is new 


6533 


99 


477 


AF241230 


Homo sapiens 


TAK1-binding protein 2


3656 


100 


478 


AL031534 


Schizosaccharomyces
pombe


putative asparagine synthase 


482 


40 


479 


L28125 


Podospora 
anserina 


beta transducin-like protein


233 


26


480 


AF161544


Homo sapiens 


HSPC059 


434 


77


481 


AJ238248 


Homo sapiens 


centaurin beta2


3986 


99


482 


Z38061 


Saccharomyces
cerevisiae


mal5, sta1, len: 1367, CAI:
0.3, AMYH_YEAST P08640
GLUCOAMYLASE S1 (EC 3.2.1.3)


295 


23


483


AF161381 


Homo sapiens 


HSPC263 


1404 


100


484 


AF223468 


Homo sapiens 


AD021 protein 


1314 


100 


486 


X57527


Homo sapiens 


alpha 1(VIII) collagen


4166 


99 


487


Y19062 


Homo sapiens 


39k3 protein 


2475 


100 


488 


Y73373 


Homo sapiens 


HTRM clone 921803 protein 
sequence.


555 


56 


489 


AL021918 


Homo 
sapiens 


J034IB.1 (Kruppel related zinc 
Finger protein 184) 


4184 


100 


490 


X53773 


Rattus
norvegicus


alpha-c large chain (AA 1-
938)


4675 


97 


491


U52426 


Homo sapiens 


GOK 


1459


59


492

493


AL359773 


Leishmania 
major 


possible threonine synthase 


702


45 




AF22*614 


Homo sapiens 


ferroportin1


2929 


100 


494 


Z93241 


Homo sapiens 


dJ222B13.1 (novel protein
with some similarity to
Drosophila kraken)


513 


96 


495


AF036977 


Homo sapiens 


unknown


1812


100 


496


U93564 


Homo sapiens 


p40 


133


45 


497


Y91405 


Homo sapiens 


Human secreted protein 
sequence encoded by gene 2 
SEQ ID NO:126. 


357 


100 


498 


AF069781 


Drosophila 
melanogaster 


Bem46-like protein


653 


43 


499


Y16601 


Homo sapiens 


Human cell-cycle
phosphoprotein CECYP-2. 


1658 


98 


500 


X70944 


Homo sapiens 


PTB-associated splicing 
factor 


3883 


100 


501 

502

503


AF027503 
AF282874 


Mus musculus
Homo sapiens 


putative membrane-associated
guanylate kinase 1 
nectin 3; PRR3


205 
2856 


36 
99 


504


AJ249732


Homo sapiens


G8 protein


669


100


505


AF208861


Homo sapiens


BM-019


1629


100


507


L09708


Homo sapiens


complement component C2


4022


100


508


JC66285


Mus musculus


HC1 ORF


115


43




D00189


Rattus
norvegicus


Na+,K+-ATPase alpha-subunit


5227


99



TUT 



Y94971 



Homo sapiens 



AB019036 



Homo sapiens 



Human secreted protein clone 
fai7i_i protein sequence SEQ 
ID NO: 148.



217€T 



beta-1,4 mannosyltransferase 781



100 



77 



512 



Homo sapiens 



AB019038 



Homo sapiens 



beta-1,4 mannosyltransferase 1347



beta-1,4 mannosyltransferase 1520



100 



99 



513 



514 
515 



X84908 



XS28S1" 



Homo sapiens "phosphorylase kinase | 57H T 

Homo sapiens | peptidyl prolyl isomerase | 650



99 



76 



-sir 



517 



518 



520 



521
522 



AF186084 



Homo 
sapiens 



G03602 



Homo sapiens 



epidermal growth factor 
repeat containing protein 



3046 



U04706 



Human secreted protein, 
ID NO: 7683. 



SEQ 



505 



G00653 



Bos taurus 
Homo sapiens 



AF161475 



Homo sapiens 



Y99366 



Homo sapiens 



SOkDa protein 



Human secreted protein, SEQ 
ID NO: 4734. 



1749 



530 



HSPC126 



Human PRO1475 (UNQ746) amino
acid sequence SEQ ID NO: 88. 



1368 



3394 



99 



99 



77 



106- 



100 



97 



AF266852



Homo sapiens



PTPLA



1295



100



AE000995



Archaeoglobus
fulgidus



chromosome segregation
protein (smc1)



153



20



AF052249



Homo sapiens 



immunoglobulin heavy chain | 605 

variable region 

ARE1 I 2950 



97



525 



526



AJ223830 



W01535 



Rattus
norvegicus 
Homo sapiens 



AF145658 



Drosophila 
melanogaster 
Homo sapiens 



Cellular homologue of the 
SV40 large T antigen. 



127$ 



BcDNA.GH10229 



320 



98 



83 



33 



523 



529



530 



AF112213 



D49387 



putative Rab5-interacting
protein 



524 



Y30819 



Homo 
sapiens 



Homo sapiens 



NADP dependent leukotriene b4 
12-hydroxydehydrogenase



1616 



AL079335 



Homo sapiens 



Human secreted protein 
encoded from gene 9.



328 



dJ132F21 



JT73 (72.1 KDa protein 
(DKFZP564A032, SBBI88)
similar to mouse IFN-gamma
induced MG11.)



1059 



79 
100 



32 



99 



532 



533 



534 



535 



536 



538 



539 



540 



541 



542 
543



545



Homo sapiens



X76116 



Caenorhabditis
elegans



Human secreted protein | I is 3 

sequence encoded by gene 56 
SEQ ID NO: 179. 

carrier protein (c2) | 576



X76116 



X12966 



Caenorhabditis
elegans



Homo sapiens 



Y09267



Homo sapiens 



Z11773



D84224 



Homo sapiens 



D84224 



Homo sapiens 



D84224 



Homo sapiens 



D84224 
J03244



Homo sapiens 



Homo sapiens 



Bos taurus 



Y92514



AF221712 



Homo sapiens 



AE000919 



Homo 
sapiens 



A06669



Methanobacterium
thermoautotrophicum



synthetic 
construct 



carrier protein (c2) 



506 



3-oxoacyl-CoA thiolase
propeptide (424 AA) 



1972 



flavin-containing 
monooxygenase 2 



2486 



SRE-ZBP 



methionyl tRNA synthetase



2201 



methionyl tRNA synthetase



4741



methionyl tRNA synthetase



3887 



methionyl tRNA synthetase



2933 



H+ ATPase 31kDa subunit (EC
3.6.1.3) 



4529 



848 



Human OXRE-ll. 



Smad- and Olf-interacting
zinc finger protein 



2301



2151 



conserved protein 



207



preTGF-beta1



2070 



98 



50 



50 



100 



100 



99 



99 



99 

96

T9~ 



77 



99 



61 



99 






WO 01/53312 



TABLE 2 



PCT/US00/34263 





546


Y02698 


Homo sapiens 


Human secreted protein 
encoded by gene 49 clone 
HTPCS60.


854 


98 


547


AF112205 


Homo sapiens 


WSB-1 protein 


2275 


100 


548 


X60271 


Mus musculus 


c-rel 


2264 


74


549


AC016827 


Arabidopsis 
thaliana 


putative GTPase 


810 


42 


550


Y70400 


Homo 
sapiens 


Human cell-signalling
protein-2.


429 


68 


551 


AB048365


Homo sapiens 


NEDD4-like ubiquitin ligase 1


8290 


99 


552 


Y57880 


Homo sapiens 


Human transmembrane protein 
HTMPN-4.


1112 


95 


553 


AF119855


Homo sapiens 


PRO1847


265 


67 


554 


M17236


Homo sapiens 


MHC HLA-DQ alpha precursor


1332 


100 


555 


AL078468 


Arabidopsis
thaliana 


putative protein 


540 


40 


556 


AC006963 


Homo sapiens 


similar to Kelch proteins; 
similar to BAA77027
(PID:g4650844)


515 


44 


557 


AK024487 


Homo sapiens 


FLJ00086 protein 


1623 


98 


558 


M12140 


Homo sapiens 


pol gene protein; Xxx 


117 


48 


559 


W74825


Homo sapiens 


Human secreted protein 
encoded by gene 97 clone 
HAQBF73.


225 


56 


560 


X56S81 


Homo sapiens 


junD protein 


373 


88 


561 


AF003136


Caenorhabdit 
is elegans 


contains weak similarity to 
an AMP-binding motif 


2926 


54 


562 


AL139839


Homo sapiens 


dJ1069P2.3.1 (novel PABPC1
(poly(A)-binding protein)


877 


100 


563 


AF181640 


Drosophila 
melanogaster 


BcDNA.GH09817


289 


42 


564 


AF052723 


Feline
leukemia
virus


gag-pol precursor polyprotein 
gPr80 


1547


43 


565


AF161472 


Homo sapiens 


HSPC123 


439 


44 


566 


Y28817 


Homo sapiens 


pt326 4 secreted protein. 


3338 


100 


567


U09848 


Homo sapiens 


zinc finger protein 


1738 


100 


569 


AF155113 


Homo sapiens 


NY-REN-55 antigen


3603 


93 


570 
571


AF155113 


Homo sapiens 


NY-REN-55 antigen


3951 


99 




AL032821 


Homo sapiens 


dJ55C23.1 (vanin 1)


1821 


98 


572 


M69181 


Homo sapiens 


non-muscle myosin B 


7350 


99 


573 


M69181 


Homo sapiens 


non-muscle myosin B


7311 


98 


574 


Y59678 


Homo sapiens 


Secreted protein 108-008-5-0- 
E6-FL. 


772 


100


575 


AL365234


Arabidopsis
thaliana 


putative protein 


788 


40 


576 


AL365234


Arabidopsis 
thaliana 


putative protein 


788 


40 


577 


X06745 


Homo sapiens 


DNA polymerase alpha-subunit
(AA 1-1462)


7619 


99 


578 


AB041642 


Homo sapiens 


PAR-6


1342 


100 


579 



D86984 


Homo sapiens 


similar to yeast adenylate 
cyclase (S56776) 


2446


100 


580


AF165124 


Homo sapiens 


gamma-aminobutyric acid A
receptor gamma 2 


2499 


99 


581 
582 


W88812 
U82319 


Homo sapiens 
Homo sapiens 


Polypeptide fragment encoded 

hv npnp c: p 

novel ORF


2339 


99 


583
584 


P92219 
AJ223948 


Homo sapiens 
(human) 
Homo sapiens 


CR1 protein. 
RNA helicase 


342 
11425 


100 
99 


585 


Y08612 


Homo sapiens 


88kDa nuclear pore complex
protein 


6608 
3874 


99 
99 


586 
587 


Y42384 

AF129756


Homo
sapiens 

Homo sapiens


Amino acid sequence of 

Iv3l0 7. 

3AT4 


1007 
1873 


37 
98


588 


AF131775


Homo sapiens 


Unknown 


1929 


99 


589 


AJ250865 


Homo sapiens 


TESS 2 


2348 


100 


591 


Z98885 


Homo sapiens 


dJ522J7.2 (bromodomain-
containing 1 (similar to
peregrin, BR140))


4167 


100 


592 


L76571 


Homo sapiens 


nuclear hormone receptor


1355


100 


593 


AF091622 


Homo sapiens 


PHD finger protein 3 


9054 


100 


594 


X56807 


Homo sapiens 


desmocollin type 2a 


4443 


100 


595 


AL137802


Homo sapiens 


dJ798A10.1 (novel protein)


212 


55 


596 


AL022329 


Homo 
sapiens 


DK407F11.2 (adrenergic, beta, 
receptor kinase 2) 


3653 


100 


597 


AF226048 


Homo sapiens 


GL003 


2009 




598 


AJ278112 


Homo 
sapiens) 
>Y49635 
Y49635 21-
OCT-1999 15- 
APR-1998 
Human sdp3.5
protein. 
[Homo 
sapiens 


putative cell cycle control 
protein 


335 


23 


599 


Y59741 


Homo sapiens 


Human normal ovarian tissue 
derived protein 10. 


1574 


99 


600 


L36531 


Homo sapiens 


integrin alpha 8 subunit 


5386 


99


601 


Y38458


Homo sapiens 


Human secreted protein 
encoded by gene No. 20. 


695 


100 


602 


AF218584 


Homo sapiens 


GGA1


3265 


100 


603 


Y13115 


Homo sapiens 


serine/threonine protein
kinase 


5071 


99 


604 


AL132776 


Homo sapiens 


dJ393D12.1 (KIAA0776) 


2413 


99 


605 


AL034452 


Homo sapiens 


dJ6B2Jl5.1 (novel Collagen 
triple helix repeat 
containing protein) 


1979 


100 


606 


Y14494 


Homo sapiens 


aralar1


3465 


99 


607 


AJ001981 


Homo sapiens 


OXA1L


2603 


100 


608 


XS6098 


Homo 
sapiens 


binds directly to adenovirus 
type 5 E1A protein 


3069


100 


610 


AF163572


Homo sapiens 


Forssman glycolipid
synthetase 


1865 


99 


611 


AF161503 


Homo sapiens 


HSPC154 


1261 


97 


612 


L41834 


Ensis minor 


nuclear protein 


345 


30


613 


Y91954


Homo sapiens 


Human cytoskeleton associated
protein 9 (CYSKP-9).


3*68 


100 


614 


AL022327 


Homo sapiens 


dJ355C18.1 (KIAA0027)


361 


94 


615 


X85786 


Homo sapiens 


binding regulatory factor 


3203 


100 


616 


Y08319 


Homo sapiens


kinesin-2 


3487


99


617 


D12644 


Mus musculus


KIF2 protein


3609 


97 


618 


U28789 


Mus musculus 


PACT 


5936 


89 


619 


Y35914 


Homo sapiens 


Extended human secreted
protein sequence, SEQ ID NO.
163.


1684


99




620 


AB046382


Mus musculus 


testis-abundant finger
protein


199


23




621 


Y00062 


Homo sapiens 


precursor polypeptide (AA -23 
to 1120) 


3440 


99 


622 


AF068286




HL/V_i*iJJ j up 


861 


100 


623 


X98248 


Homo sapiens 


sortilin 


4436 


99 


624 


X61100 


Homo sapiens 


75 kDa subunit NADH
dehydrogenase precursor


3734


99




625 


S58544


Homo sapiens 


75 kDa infertility-related
sperm protein


2125 


99 


626 


AF151027 


Homo sapiens 


HSPC193 


582 


93 


627 


X14968


Homo sapiens 


RII-alpha subunit (AA 1-404)


2079 


100


628


Y50911 


Homo sapiens 


Human fetal brain cDNA clone 
vb7_1 derived protein


1983 


100


629 


Y50911 


Homo sapiens 


Human fetal brain cDNA clone 
vb7_l derived protein 


1694 


100 


630 


AF098786 


Homo 
sapiens 


17 beta-hydroxysteroid 
dehydrogenase type VII 


1754 


100 


631 


AL034555


Homo 
sapiens 


dJ134O19.3 (zinc finger
protein 151 (pHZ-67)) 


4273 


100 


632 


W74826 


Homo sapiens 


Human secreted protein 
encoded by gene 98 clone 
HAQBT94.


794 


96 


633 


AF288288 


Homo sapiens 


HPT protein 


223* 


100


634 


AF041429 


Homo sapiens 


pRGR1


823 


99 


635


X66357 


Homo sapiens 


serine/threonine protein
kinase 


1589 


100 


636 


Y11284 


Homo sapiens 


AFX1 


2571


98


637 


AB004884 


Homo sapiens 


PKU-alpha 


3718 


99 


638 


AJ002303


Homo sapiens 


synaptogyrin 1c


1020 


100 


639 


AJ002304 


Homo sapiens 


synaptogyrin 1b


1002 


100 


640 


AJ002303


Homo sapiens 


synaptogyrin 1c


933 


94 


641 


D87682 


Homo sapiens 


similar to a C. elegans
protein encoded in cosmid
T26A5.


26"7* 


100 


642 


M14660 


Homo sapiens 


ISG-K54 


2473 


99 


643 


X06661 


Homo sapiens 


calbindin (AA 1-261) 


1358 


100 


644 


AF119900 


Homo sapiens 


PRO2822


185 


-?6 


645 


AB031048 


Drosophila 
melanogaster 


microtubule associated- 
protein orbit 


738 


27 


646 


AF250842 


Drosophila 
melanogaster 


multiple asters 


834 


29 


647 


X86691 


Homo sapiens 


Mi-2 protein


10110 


99 


648 


U67934 


Homo sapiens 


44.9 kDa protein C18B11 
homolog


8 Z> 7 




649 


AF236061 


Oryctolagus 
cuniculus 


RING-finger binding protein


3330




650 


AL034553 


Homo sapiens 


dJ914P20.2 (KIAA0784 protein
similar to Mus musculus
activity-dependent
neuroprotective protein
(Adnp))


5708 


100 


653 


X14766 


Homo sapiens 


GABA-A receptor alpha 1
subunit 


2386 


99 


654 


AC004614


Homo sapiens 


similar to f-spondin proteins 
AB006086 (PID:g2529225) 


3026 


99 


655 


Y57908


Homo sapiens 


Human transmembrane protein 
HTMPN-32. 


608 


99 


656 


Z34975


Homo sapiens 


ldlCp 


3733


100 


658 


AL050306 


Homo sapiens 


dJ475B7.2 (novel protein) 


1942 


99 


659 


W76734 


Homo 
sapiens 


Human mDia Rho targeting 
protein. 


781 


34 


660 


AF202724 


Homo sapiens 


Sadl unc-84 domain protein 1 


2172 


100 


661 


Z21966 


Homo sapiens 


mPOU homeobox protein 


1529


100


662 


AJ242954 


Mus musculus


dysferlin 


4752 


59 


663 


AF18231S 


Homo sapiens 


myoferlin 


6232 


99 


££5 
667 


AL161516 
X59303 


Arabidopsis 
thaliana
Homo sapiens 


hypothetical protein 
valyl-tRNA synthetase 


209 


30 


668 


Y133SS 


Homo sapiens 


Amino acid sequence of 
protein PRO220.


3393 
3692


99
100 


669 


AB010692 


Arabidopsis 
thaliana 


contains similarity to endo-
beta-N-acetylglucosaminidase
gene


611 


52


671 


X56123 


Mus musculus


talin 


4474 


76 


672 


AB039371 


Homo sapiens 


mitochondrial ABC transporter 
3 


2902 


99 


673 


AF269223 


Homo sapiens 


TCP11


806 


42 


674 


AF229633 


Mus musculus


groucho-related protein 4 


4053 


99 


675 


L14463 


Rattus
norvegicus


transducin


3619


92


676


-fA V«# V v J f J / 


4 1 IU DCtUiCUD 


R3 2611 1 


2779 


100 


677 


S61069 


Homo sapiens 


reverse transcriptase 
homolog=pol {retroviral
element}


252 


65 


678 


AF271388 


Homo sapiens 


CMP-N-acetylneuraminic acid
synthase 


2273 


100 


679 


X79066 




ERF-1


1783 


100 


680 


AF118566 


Mus musculus 


hematopoietic zinc finger 
protein 


769 


50 


681


VC1 AT Q 
131413 


Homo 
sapiens 


tinman i.ri 1 r? h\mo nVofi^ 

nuiiicin wiiu type p^co j .» 

protein. 


~26~2i 




682 


AL133545


Homo sapiens 


bA386N14.1 (novel protein 
similar to a dual specificity 
phosphatase) 


700 


68 


683


Y86214 


Homo sapiens 


Nuclear transport protein 
clone hfb34l protein 

b U J I U» * 


3000 


QQ 


684 


Y94952 


Homo sapiens 


Human secreted protein clone 

•^Vil 1 C\ 11 riTnfpi n earntenoo 
SEQ ID NO:110.


354 


98 


685 


AT.02 1 87B 




rf»T7S"7T70 A ( hTaTQrr'i i on 
factor 20 (AR1 i (KXAA0292) 
(isoform 2) ) 


154 


67 


686


AE000198 


Escherichia
coli


orf, hypothetical protein


628 


100 


687 


M58378 


Homo sapiens 


synapsin X 


3730 


99 


688 


AF039697 


Homo sapiens 


antigen NY-CO-31


508 


98 


689 


wU^O 33 


cuniculus


gamma subunit 


2356 


99 


690 


MJC J. 3 3 1 1/ D 


Pnmr*i eani &nct 

nuniu sapxcuB 


NY-RRN-lfi «nt--icrf»n 
i»i ivciii jo on l xycii 


265 


50 


691 


AC004774 


Homo sapiens 


DLX-5


1542 


100 


692 


X90530 




ragB 


192£ 


99 


693 


X90530 


Homo sapiens 


ragB 


1405 


99




X90530


Homo sapiens 


ragB 


1 con 


03 


695 


G01563 


Homo sapiens 


Human secreted protein, SEQ 

ID NO: 5644.


330 


100 


696 




thaliana


Pu £ at *i mo t* li *i i 

* uuaLrnk vs^ iiicLiiiv/iiiiic 

aminopeptidase


669 


52 


697 


AJ250425


Rattus 
norvegicus


Collybistin I 


2455 


98 


698 


AB037901 


Homo 
sapiens 


gene amplified in squamous 
cell carcinoma-1


5364 


99 


699 


Y99401


Homo sapiens 


Human *>ROl327 (UNQ<J87) amino 
acid sequence SEQ ID NO: 218. 


138$ 


100 


701 


AF221712 


Homo 
sapiens 


Smad- and Olf-interacting
zinc finger protein 


6705 


100 


702 


X83573


Homo sapiens 


ARSE 


3184 


99 


703 


AJ243274 


Homo sapiens 


AP-2 rep protein


2078 


99 


704 


Y71262 


Homo sapiens 


Human chondromodulin-like
protein, Zchm1.


1697


94 


705 


Y71262 


Homo sapiens 


Human chondromodulin-like
protein, Zchm1.


1736 


99 


706 


Y41257 


Homo sapiens 


Amino acid sequence of long 
human FAIM. 


1060 


100 


707 


AL022237 


Homo sapiens 


bK1191B2.3 (PUTATIVE novel
Acyl Transferase similar to
C. elegans C50D2.7) (isoform
1))


2030 


100 


708 


AJ006266 


Homo sapiens 


AND-1 protein 


5942 


100 


709 


G01571


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5652. 


111 


99 


710 


Y08698 


Homo sapiens 


ranbp3 


2849 


98 


711 


Y68770 


Homo sapiens 


Amino acid sequence of a 
human phosphorylation 
effector PHSP-2. 


754 


99


712 


U93574 


Homo sapiens 


putative pl50 


799 


59 


713 


AC004531


Homo sapiens 


Gene with similarity to DEAD
box helicases


2715 


99 


714 


D89016 


Homo sapiens 


Neuroblastoma 


538 


48 


715 


Y92175


Homo sapiens 


Human cardiovascular system 
associated protein tyrosine 
phosphatase 2.


734 


98 


716 


AL137013 


Homo sapiens 


bA311P8.3 (probable uracil 
phosphoribosyltransferase)


862 


100 


717 


AB035123 


Mus musculus


GD1 alpha/GT1a alpha/GQ1b
alpha synthase 


1696 




718 


Y96290 


Homo 5.P40254 
P40254 25-
OCT-1984 09- 
APR-1983 
Human IgD.
[Homo 
sapiens 


Human IGFAM-2 immunoglobulin. 


2345 


85 


719 


X07979 


Homo sapiens 


integrin beta 1 subunit 
precursor 


4347 


99 


720 


AJ224819 


Homo sapiens 


tumor suppressor 


2149 


99 


721 


Y07595


Homo sapiens


transcription factor irXJJrt


2373 


100 


722


W41565


Homo

>W41564
W41564 08- 
OCT-1997 05- 
APR-1996 
Human 
calpain. 

[Homo 
sapiens 


Human calpain. 


1591 


99 


723 


AF161341 


Homo sapiens 


HSPC078 


1097 


90 


724 


AF187318


Homo sapiens 


F-box protein Fbx2 


1607 


100 


725 


AC006708 


Caenorhabditis
elegans


contains similarity to
Saccharomyces cerevisiae pre-
mRNA splicing protein PRP31
(GB:Z72876)


1143 


46 


726


AC006708 


Caenorhabditis
elegans


contains similarity to
Saccharomyces cerevisiae pre-
mRNA splicing protein PRP31
(GB:Z72876)


988 


46 


727 


AC024818 


Caenorhabditis
elegans


contains similarity to Pfam 
family PF00400 (WD domain, 
G-beta repeat), score=81.8,
E=1.4e-20, N=3


950 


44 


728 


AJ005897


Homo sapiens 


JM5 


831 


47 


729 


Y45377 


Homo sapiens 


Human secreted protein 
fragment encoded from gene 
27. 


908 


97 


730


G03931 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8012. 


578 


100 


731 


AB012720 


Oncorhynchus
masou


GTP-binding protein


3865 


76 


732 


W73404 


Homo sapiens 


Human secreted protein 
encoded by Gene No. 8.


862 


97 


733


G02650 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6731. 


644


97


734 


AC024813


Caenorhabditis
elegans


Hypothetical protein 
Y54F10AL.a


152 


24 


735 


AL035461


Homo sapiens 


dJ967N21.6 (novel CDP-alcohol
phosphatidyltransferase
family member protein) 


1562 


98 


736 


U00033


Caenorhabditis
elegans


similar to S. cerevisiae YJU2 
protein 


605 


41 


737 


AF079098


Homo 
sapiens 


arginine-tRNA-protein
transferase 1-1p; ATE1-1p


2733 


99 





738 


AJ131712 


Homo sapiens 


nucleolar RNA-helicase 


2793


100 


739 


AJ133115 


Homo sapiens 


TSC-22-like protein 


2054 


99 


740 


X98258 


Homo sapiens 


M-phase phosphoprotein 9


953 


100 


741 


X98258 


Homo sapiens 


M-phase phosphoprotein 9 


564 


74 


742 


U97191 


Caenorhabditis
elegans


strong similarity to the YPT1
sub-family of RAS proteins


960 


85 


743 


X76057 


Homo sapiens 


phosphomannose isomerase


2191 


100 


744 


G03209 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7290. 


496 


98 


745 


X97064 


Homo sapiens 


Sec23 protein 


4034 


99 


746 


W93946 


Homo sapiens 


Human regulatory molecule 
HRM-2 protein. 


994 


100 


747 


Y73388 


Homo sapiens 


HTRM clone 3376404 protein
sequence.


1565 


99 


748 


M19529 


Sus scrofa


follistatin A 


190$ 


98 


749 


AJ249457 


Trichomonas 
vaginalis 


centrin, putative 


183 


28 


750 


AC004410


Homo sapiens 


foa39554_l 


2094 


100 


751 


AF074968 


Homo sapiens 


P47ING3 protein 


2167 


100 


752 


AF252284 


Homo sapiens 


transcription specificity 
factor Sp1


4005 


100 


753 


AB049629 


Homo sapiens 


phospholysine 

phosphohistidine inorganic 
pyrophosphate phosphatase 


1375 


99 


754 


D79205 


Homo sapiens 


ribosomal protein L39


160 


77 


755 


AB008430 


Homo sapiens 


CDEP 


142 


29 


758 


L32162


Homo sapiens 


transcription factor 


574 


80 


759 


AF037204 


Homo sapiens 


RING zinc finger protein 


295 


54


760 


Y44250 


Homo 
sapiens 


Human cell signalling
protein-13.


£25 


100 


761 


AF218586 


Homo sapiens 


Cide-b 


1136 


100 


762 


U38934 


Gallus 
gallus 


histone H2A 


625 


97 


763 


AF226053 


Homo sapiens 


HSKM-B 


606 


32 


764 


X13403 


Homo sapiens 


Oct-1 protein (AA 1-743)


3626 


100 


765 


D87446 


Homo sapiens 


Similar to a C. elegans
protein encoded in cosmid 
C27F2 (U40419) 


568 


38 


766 


AL023828


Caenorhabditis
elegans


Y17G7B.14 


200 


27 


767 


Y82777 


Homo sapiens 


Human chordin related protein 
(Clone dw665 4).


2551 


99 


768 


X92475 


Homo sapiens 


ITBA1 


1429 


100 


769 


Y42752 


Homo sapiens 


Human calcium binding protein 
3 (CaBP-3). 


1426 


100 


770 


X51416 


Homo sapiens 


hormone receptor hERR1 (AA 1-
521)


2641 


97 


771 


AJ006591 


Homo sapiens 


cysteine-rich protein 


1793 


100 


772 


A08695 


Homo sapiens 


rap2 


935 


100 


773 


Z12173 


Homo sapiens 


N-acetylglucosamine-6-
sulphatase


2970 


100 


774 


Y91950 


Homo sapiens 


Human cytoskeleton associated 
protein 5 (CYSKP-5).


565 


43 


776


AL023799 


Homo sapiens 


dJ322P7.1 (zinc finger)


855 


56 


777 


AL023799 


Homo sapiens 


dJ322P7.1 (zinc finger) 


855 


56 


778 


G01880 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5961. 


849


98 


779 


AJ012590 


Homo sapiens 


glucose 1-dehydrogenase


4155 


99 


780 


AL078582 


Homo sapiens 


dJ130E4.2 (KIAA0796) 


1321 


68 


781


Z7595£ 


Caenorhabditis
elegans


similar to mitochondrial 
carrier protein 


384 


34 


782


AL109965 


Homo 
sapiens 


dJ1121G12.2 (SCAN domain- 
containing 1 protein) 


900 


100 


783 


AF061262 


Mus musculus


semaF cytoplasmic domain 
associated protein 2 


1316 


83 


784 


G03873 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 7954.


649


95


785 


Y84441 


Homo sapiens 


Amino acid sequence of a 
human RNA-associated
protein. 


2074 


100 


786 


Y00918 


Homo sapiens 


Human Rab protein, RABP-1, 
protein sequence. 


1048 


q a 


787 


Z97029 


Homo sapiens 


ribonuclease HI large subunit 


1548


99 


788


AB035384 


Homo sapiens 


SRp25 nuclear protein


962 


V H 


789 


AF024S31 


Homo sapiens 


ANG2 


2644 


100 


790 


AJ006720 


Rattus 
norvegicus 


phospha tidylinosi tol 3 -kinase 


4508 


Q 7 
Zf t 


792 


V00638 


bacteriophage
lambda


reading frame ea10


600 


100


793 


AF049103 


Homo sapiens 


Huntingtin interacting 
protein 


819 


100 


795 


Z26317 


Homo sapiens 


desmoglein 2


4810 


99


796 


Y76884 




ftCU ■L-llVlJj-a.Zi LOllla D XliCIX ily 

protein-7 sequence.


5080


99 


797 


U15155 


Gallus 
gallus 


trypsinogen 


372


37 


798 


U97189 


Caenorhabditis
elegans


strong similarity to the
PI3/PI4 family of kinases


227 


26 


799 


AF112201 


Homo sapiens 


neuronal protein NP25 


1053 


100 


800 


A£234?6"5 


Rattus
norvegicus 


serine-arginine-rich splicing
regulatory protein SRRP86


958 


63 


801 


AF267852 


Homo sapiens 


placental protein 13-like
protein


743 


99 


802 


AF208851 


Homo sapiens 


BM-009 


766 


80 


803 


Z81097 


Caenorhabditis
elegans


Similarity to Human
retinoblastoma-binding
protein RBAP4fi vlffifio^io a
comes from this gene


152 


27 


804 


G02113 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6194.


496 


98 


805 


AL121673 


Homo sapiens 


bA305P22.1 (novel protein) 


1160 


100


806 


AC013483


Arabidopsis 
thaliana 


putative GTPase activator 
protein 


264 


30 


807 


AC013483 


Arabidopsis 
thaliana 


putative GTPase activator 
protein 


264 


30


808 


AB013885 


Homo sapiens 


beta-ureidopropionase


1494 


100 


809 


AF078842 


Homo sapiens 


HOTTL protein 




99 


810 


AF161421 


Homo sapiens 


HSPC303


2134 


96 


811 


AF261689 


Homo sapiens 


subunit 




100 


812 


Z74029 


Caenorhabditis
elegans


Similarity to C. elegans 
alcohol dehydrogenase comes 
from this gene 


610 


71 


813 


Z73497


Homo sapiens 


cU240C2.2 (Core histone 
H2A/H2B/H3/H4) 


324 


100


814 


W87689 


Homo 
sapiens 


Human HTXFT19 polypeptide. 


1484 


99 


815


X16282 


Homo 
sapiens 


zinc finger protein (217 AA) 
(1 is 2nd base in codon) 


1109 


99 


816 


Z92539 


Mycobacterium
tuberculosis


pth 


300 


36 


818 


AB030483 


Mus musculus 


B9 


197 


27 


819 


AL117555 


Homo sapiens 


hypothetical protein 


321 


94 


820 


AC005328 


Homo sapiens 


R26660_2, partial CDS


865 


97 


821 


G03951


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8032. 


700 


99 


822


L34807 


Musca 
domestica


transposase


174 


20 


823 


G02928


Homo sapiens


Human secreted protein, SEQ
ID NO: 7009.


558


78


824


Z99531


Schizosaccharomyces
pombe


caffeine-induced death
protein 1


184


29


825


AJ006692 


Homo sapiens 


ultra high sulfur keratin


693 


68


826 


U23037 


Oryctolagus 
cuniculus


eIF-2Bepsilon 


3406 


90


827 


G03412 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7493. 


464 


100 


828


Y30327 


Homo sapiens 


Human secreted protein 
encoded from gene 17. 


113 


44


829 


Y32199 


Homo sapiens 


Human receptor molecule (REC) 
encoded by Incyte clone 
2022379.


1012 


100 


830 


W78279 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 33.


1264 


99


832 


AB011542 


Homo sapiens 


MEGF9 


2097 


100


833 


G02639


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6720. 


223 


70


834 


AF119664 


Homo sapiens 


transcriptional regulator 
protein HCNGP 


1574 


100


835 


AF119664 


Homo sapiens 


transcriptional regulator 
protein HCNGP 


1144 


89 


836 


AF119664


Homo sapiens 


transcriptional regulator 
protein HCNGP 


1448 


94


837


X12517 


Homo sapiens 


C protein (AA 1-159)


918 


100


838 


U32865 


Drosophila 
melanogaster 


linotte protein 


164 


24


839 


AF067730 


Homo sapiens 


TLS-associated protein TASR-2


631 


56


840 


U27831 


Homo sapiens 


striatum-enriched phosphatase


2840 


go j 
70 ] 


841 


AF286366 


Homo sapiens 


CamKI-like protein kinase 


1796 


1 


842 


G02309 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6390. 


27Q 




843 


AE063615 


Drosophila 
melanogaster 


ade3 gene product 


113 


48


844 


G01350


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5431.


629 


100 


845 


U27838


Mus musculus


glycosyl-phosphatidyl-
inositol-anchored protein
homolog 


3305 


9* 


847 


Y87788 


Homo sapiens 


Human RBP-26 protein. 


2026 


100 


848 


AF164794 


Homo sapiens 


Diff33 protein homolog 


2398 


100


849 


U41315 


Homo sapiens 


ZNF127-Xp 


2458 


93


850 


AF192784


Homo sapiens 


makorin 1 


2062 


97 


851 


Y58528


Homo sapiens 


Protein regulating gene 
expression PRGE-21. 


1548 


100 


852 


Z22968 


Homo sapiens 


M130 antigen 


6205 


100 


853 


Z22971 


Homo sapiens 


M130 antigen extracellular 
variant 


6380 


100


854 


G03362 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7443.


330 


96


855 


G03362 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7443.


203 


100


856 


AF285118 


Homo sapiens 


CGI-203 


452 


100


857 


AC006069


Arabidopsis 
thaliana 


putative cleavage and 
polyadenylation specificity
factor 


1383 


55 


858 


AL021546


Homo sapiens 


Cytochrome c Oxidase 
Polypeptide VIa-liver
precursor (EC 1.9.3.1) 


593 


100


859 


L029S6 


Xenopus 
laevis 


ribonucleoprotein 


1664 


65 


860 


AF201947 


Homo sapiens 


MEK binding partner 1 


616 


100


861


L31783 


Mus musculus 


uridine kinase 


126* 


92


862 


AF161472 


Homo sapiens 


HSPC123 


602 


73


863


Z49068


Caenorhabditis
elegans


mitochondrial carrier protein


370


43


864


AF154108


Homo sapiens


tumor necrosis factor type 1
receptor associated protein


3559


99


865 


AE001530 


Helicobacter 
pylori J99 


putative 


230 


32 


866 


X57807 


Homo sapiens 


immunoglobulin lambda light 
chain 


£99 


91 


867 


AL031673 


Homo sapiens 


dJ694B14.1 (PUTATIVE novel
KRAB box protein with 18 C2H2 
type Zinc finger domains) 


4066 


99 


868 


Y11652 


Homo sapiens 


phosphate cyclase 


238 


100


869 


AF192968 


Homo sapiens 


high-glucose-regulated
protein 8 


3041 


99 


870 


AB020648 


Homo sapiens 


KIAA0841 protein 


3237 


59


871 


AL031427 


Homo sapiens 


dJ167A19.1 (novel protein) 


1608 


100


872 


AF151534 


Homo sapiens 


core histone macroH2A2.2


1864 


100


873 


AL021331 


Homo sapiens 


dJ366N23.1 (putative C.
elegans UNC-93 (protein 1, 
L.*tbt ii . 2; iviKE protein) 


1129 


100 


874


X14606 


Homo sapiens 


propionyl-CoA carboxylase 


3579 


100 


875 


AL117334 


Homo sapiens 


CU687F11.1 (novel protein 
(part of translation of cDNA 
DKFZp434N061, Em:AL110249) ) 


306 


100 


876 


X79489 


Saccharomyces
cerevisiae


E-925 protein 


446 


35 


877 
878 


Y53001
AF2^i0^4 




Human secreted protein clone 
dn834 l protein sequence SEQ 
ID NO: 8. 


811 


100 


879 


X79417 


Sus scrofa 


40S ribosomal protein S12 


957 
687 


100

100


880 


AF001317 


Saccharomyce 
s cerevisiae 


Soi1p


478 


28 


881 
882 


Y87275 
M14036 


Homo sapiens 
Homo sapiens 


Human signal peptide 
containing protein HSPP-52 
SEQ ID NO: 52.
C1-inhibitor


2547 


100 


883 
884 


AB041261 
AF020313 


Homo sapiens 
Mus musculus 


calcium-independent
phospholipase A2
proline-rich protein 48


598 
2903 


77 
100 


885 


Y10936 


Homo sapiens 


hypothetical protein 


999 

1104


84 
99 


886 


AF073997 


Mus musculus 


myotubularin related protein 
1 


866 


36 


887 


Y57893 


Homo sapiens 


Human transmembrane protein 
HTMPN-17. 


1099


94 


888
889


AL117635 


Homo sapiens 


hypothetical protein 


929


99 




AF210317 


Homo sapiens 


facilitative glucose 
transporter family member
GLUT 9 


2046


99 


890 


Y36031 


Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO. 
416. 


583 


100 


891 


Y36031 


Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO 
416. 


192 


57 


892 
893 


AF237631 
AF090929 


Homo sapiens 
Homo sapiens 


ubiquitous tropomodulin 
PRO0477p


1798 


100 


894 

895 


AL031228 


Homo sapiens 


dJ1033B10.2 (WD40 protein 

BING4 (similar to S. 
cerevisiae YER082C, M. sexta 
MNG10 and C. elegans F28D1.1) 


553
3196 


99 
100 


896


AL031228
AF271102


Homo sapiens
Homo sapiens


dJ1033B10.2 (WD40 protein
BING4 (similar to S.
cerevisiae YER082C, M. sexta
MNG10 and C. elegans F28D1.1)
retinal degeneration B beta


2825


96


897


AE003551


Drosophila
melanogaster


CG18176 gene product


1302
S33


95
33


898 


ACT237946 


Homo sapiens 


DEAD Box Protein 5 


2443 


100 


899 


Z97184 


Homo sapiens 


KKE2 


624 


100 


900 


Z97184 


Homo sapiens 


KKE2 


409 


98 


901 


AJ245587 


Homo sapiens 


Kruppel-type zinc finger 


1942 


100 


902 


AF091034 


Homo sapiens 


GTP-binding protein RAB22A 


1011 


100 


903 


R95953 


Homo sapiens 


Eukaryotic cell growth 
inhibiting factor. 


414 


96 


904 


L04733 


Homo sapiens 


kinesin light chain 


1936 


72 


905 


AE003540 


Drosophila
melanogaster 


CG10984 gene product 


446 


33 


906 


M55542 


Homo sapiens 


guanylate binding protein 
isoform I 


2993 


98 


907 


M55542 


Homo sapiens 


guanylate binding protein 
isoform I 


2901 


96 


908 


W84085 


Homo sapiens 


Human membrane fusion protein 
WDProl . 


1889 


100 


909 


AF168676 


Homo 
sapiens 


TNF intracellular domain- 
interacting protein 


647 


100 


910 


AB029150 


Homo sapiens 


KRAB zinc finger protein 
HFB101L 


2196 


100


911 


G02871 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6952. 


521 


100


912


G03162


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7243.


387 


87 


913 


AJ243721 


Homo sapiens


dTDP-4-keto-6-deoxy-D-glucose 
4-reductase 


1710 


100 


914 


U24189 


Caenorhabditis
elegans


hypothetical protein 1207-1/ 
Method: conceptual 
translation supplied by
authors 


244


41 


915 


Y02591 


Homo sapiens 


A human progesterone receptor 
complex p23-like protein. 


843 


99 


916 


AE000984 


Archaeoglobus
fulgidus


activating glycohydrolase 
(draG) 


171 


fib 


918


M23159 


Cricetus 
cricetus 


DHFR-coamplified protein


163 


30 


919 


L12018 


Caenorhabditis
elegans


putative 


1232 


41 


920 


AF102177 


Homo sapiens 


tumor antigen SLP-8p


1260 


97 


921 


AL096712 


Homo sapiens 


dJ744I24.2 (similar to a 
novel human gene mapping to 
Activator) 


1017 


78 


922 


AL161495 


Arabidopsis 
thaliana 


putative WD-repeat protein


866 


42 


923 


AL161495 


Arabidopsis 
thaliana 


putative WD-repeat protein 


442 


36 


924 


U97001 


Caenorhabditis
elegans


similar to 

Schizosaccharomyces pombe 


605 


51 


925 


X71978 


Mus musculus


Fi£ 


1503 


95 


926 


K92288 


Drosophila
melanogaster 


beta-spectrin


290 


51 


927 


Y27575 


Homo sapiens 


Human secreted protein 
encoded by gene No. 9.


1392 


100 


928 


Y22499 


Homo sapiens 


Human secreted protein 
sequence clone mh703_jl . 


2249 


100 


930


AJ224326 


Homo sapiens 


ribulose-5-phosphate-
epimerase


912 


100 


931 


U28991 


Caenorhabditis
elegans


coded for by C. elegans cDNA
cm21c7


660


55


932 


AL080065


Homo sapiens 


hypothetical protein 


210 


25 


933 


G01884


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5965. 


767


98


934 


AJ276485 


Homo sapiens


integral membrane transporter 
protein 


1200 


100 


935 
936 


AL035681
AB026808 


Homo sapiens 
Mus musculus 


dJ756G23.3 (novel protein 
similar to drosophila 
transcriptional repressor) 
synaptotagmin XI 


1142 


80 


937 


AB015345 


Homo sapiens 


HRIHFB2216 


2142 
2601 


95
99


938 


X65724 


Homo sapiens 


ORF2


498 


100 


939 


W89024 


Homo sapiens 


Polypeptide fragment encoded
by gene 156.


1487 


100 


940 


G04047 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8128. 


117 


100 


941 


AF094583 


Homo sapiens 


putative HIV-1 infection
related protein 


452 


100 


942 
943 


AC024200 
AF129756 


Caenorhabditis
elegans

Homo sapiens 


contains similarity to 
several zinc finger proteins 
but not to the zinc finger 
domains 
GSc 


350


69


944 


K23765 


Rattus
norvegicus 


alpha-tropomyosin


273 
133 


100 
96 


945 
946 


AC009917 
AF223468 


Arabidopsis 
thaliana 
Homo sapiens 


Contains similarity to 
AD021 protein


583 


47 


947
948


AF055473 
X75756


Homo sapiens 
Homo sapiens 


GAGE-8

protein kinase C mu 


551 
273 
2019 


44 
51 
68 


949 
950 


AF143956 
Y36729 


Mus musculus 

Homo 

sapiens 


coronin-2

Human PG1 protein sequence. 


2300 
1861 


93 
99 


951 


W49041 


Homo sapiens 


Human low density lipoprotein 
binding protein LBP-2. 


282 


67 


952 


AB016881 


Arabidopsis 
thaliana 


gene_id:MXC17.7


203 


46 


953 


Y01785 


Homo sapiens 


Human ubiquitin-conjugating
enzyme >Y25341 Y25341 01-JUL- 
1999 12-AUG-1998 Human NCE-2 
protein. 


36S 


100 


954 
955 


AF14S615 
U09410 


Drosophila 
melanogaster 
Homo sapiens 


zinc finger protein ZNF131 


823 
2483 


46 
99 


956 
957 


U09410 
AF195623 


Homo sapiens 
Homo sapiens 


zinc finger protein ZNF131
cholinephosphotransferase 1
alpha 


1853
2126 


99 
99 


958 


X94917 


Drosophila 
melanogaster


head-elevated expression in 
0.9 kb 


155 


32 


959 
960 


U54807 
AF058807 


Rattus 
norvegicus 
Bos taurus 


GTP-binding protein 
GTP-binding protein rah


1167


97 


961 
962 


G03244 
AF078850 


Homo sapiens 
Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7325. 


606 
471 


97 
100 


963


AP001754 


Homo sapiens 


steroid dehydrogenase homolog 
transient receptor potential- 
related channel 7, a novel 
putative Ca2+ channel protein 


583 
317 


40 
30 


964 


AL035419 


Homo sapiens 


dJ1100H13.1 (putative novel
protein) 


1129 


100 


965 


X61381


Rattus
rattus 


interferon- induced protein 


202 


46 


966 


D38169 


Homo
sapiens 


inositol 1,4,5-trisphosphate
3-kinase isoenzyme 


3278 


100 


967 


AL031432 


Homo sapiens


U465N24.2.1 (PUTATIVE novel 

protein) (isoform 1) 


893


100 


968 


U79275 


Homo sapiens 


unknown 


611 


100 


969 


AJ011306 


Homo 
sapiens 


guanine nucleotide exchange 
factor (long isoform) 


2752 


99 


970 


AF281134 


Homo sapiens 


exosome component Rrp46


1186 


100 


971 


U53336 


Caenorhabdit 
is elegans 


weak similarity over a short 
region to myosin heavy chain 


536 


23 


972 


AC018749 


Leishmania 
major 


L8840.12 


589 


53 


973 


AF188504 


Mus musculus 


LNV 


544 


85 


974 


U25801 


Homo sapiens 


Tax1 binding protein


852 


98 


975 


AF049523 


Homo sapiens


huntingtin-interacting
protein HYPA/FBP11 


1390 


97 


976 


AF161530 


Homo sapiens 


HSPC182 


1040 


100 


977 


G04020


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8101. 


626 


100 


978 


AF164797 


Homo sapiens 


ribosomal protein L17 isolog


908 


100 


979 


U94991 


Xenopus 
laevis 


transcription factor xtiMiri


795 


97 


980 


S73775 


Homo sapiens 


calmitine; calsequestrins


2029 


100


981 


Y94888 


Homo 
sapiens 


Human protein clone HP01462. 


2501 


100 


982


AJ243191 


Homo sapiens 


heat shock protein 


827 


96 


983


X65020 




.fooi subunit of the NADH:
ubiquinone oxidoreductase
complex 


964 


85 


984 


AJ249207 


sp. AD45 


T*fc1 1 J3*l ^ "f \TSA "K^J** i*^ m -n err e» 


351


43 


985 


Z30093 


Homo sapiens 


transcription factor 2,
35 kD subunit


15/6 


99 


986 


AB030835 


Homo sapiens 


contains two glutamine rich
domains, three zinc-finger
domains, and matrin 3
homologous domain 3 (MH3)


ACQ7 


' "q"q 


987 


AF227258


Bos taurus 


RPGR-interacting protein-1


1262 


j a 


988 


AL022238 


Homo sapiens 


dJ1042K10.2 (supported by
GENSCAN, FGENES and GENEWISE)


4048 


99 


989 


AL022238 


Homo sapiens 


dJ1042K10.2 (supported by 
GENSCAN, FGENES and GENEWISE)


2321 


99 


990 


AF161426 


Homo sapiens 


HSPC308 


448 


92 


991


AF161426


Homo sapiens 


HSPC308 


446 


92 


992 


AF161426 


Homo sapiens 


HSPC308


453


92 


993 


AL023859


Schizosaccharomyces
pombe


tRNA-splicing endonuclease
subunit


172 


42 


994 


AL049631 


Homo sapiens 


dJ513M9.1 (novel Homeobox
domain protein) 


241 


47 


995 


AC005253 


Homo sapiens 


R26445_1


902 


100 


996 


AF265206 


Homo sapiens 


MOG1 isoform A


974 


100 


997 


AJ248285


Pyrococcus 
abyssi 


sarcosine oxidase, subunit
beta (soxB) 


195 


28 


998 


AE003641


Drosophila 
melanogaster 


BG:DS00941.3 gene product 


218 


SB 


999 


W69343


Homo 
sapiens 


Secreted protein of clone 
CR930_1.


1340


QQ 

79 


1000 


AY007135 


Homo sapiens 


similar to bovine ADP/ATP 
translocase Tl mRNA with 
GenBank Accession Number 
M24102.1 


1543 


100 


1001 


Y73381 


Homo sapiens 


HTRM clone 1877278 protein 
sequence.


1668 


100 


1002


AF208844 


Homo sapiens 


BM-002 


428 


100


1003 


AE004944 


Pseudomonas
aeruginosa 


hypothetical protein 


134 


35 


1004 


AL031431 


Homo sapiens 


dJ462023.2 (novel protein) 


2058 


100 


1005 


S45367 


Canis
familiaris 


centractin 


1949 


100 


1006 


S45347 


Canis
familiaris


centractin 


1315 


98 


1007 


AB022158 


Mus musculus


chaperonin containing TCP-1
epsilon subunit 


2649 


96 


1008 


Y76332 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 38. 


1282


97 


1009 


AB011414 


Homo sapiens 


Kruppel-type zinc finger 
protein 


1671 


58 


1010 


Z68218 


Caenorhabditis
elegans


K01H12.1 


269


67 


1011 


AB011414 


Homo sapiens 


Kruppel-type zinc finger 
protein 


1671 


58 


1012 


Z14000 


Homo sapiens 


RING1 


2017 


100 


1013 


G02841 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6922. 


332 


93 


1014 


AF145659 


Drosophila 
melanogaster 


BcDNA.GH10333


1244 


52 


1015 


Y02860 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 65. 


664 


67 


1016 


Y02591 


Homo sapiens 


A human progesterone receptor 
complex p23-like protein. 


772 


97 


1017 


Y99448 


Homo sapiens 


Human PRO1759 (UNQ832) amino
acid sequence SEQ ID NO: 374. 


2323 


100 


1018 


X67250 


Rattus 
norvegicus 


n-chimaerin 


1710 


97 


1019 


AF183417 


Homo 
sapiens 


microtubule-associated
proteins 1A/1B light chain 3 


631 


100 


1020 


AF164795 


Homo sapiens 


sex-regulated protein janus-a 


674 


100 


1021 


AF190625


Coturnix
coturnix


qdgl-1 


638 


96 


1022 


AL133363 


Arabidopsis 
thaliana 


putative protein 


155 


37 


1023 


AB034912 


Homo sapiens 


WD-repeat like sequence


2483 


100 


1024 


AY007091 


Homo sapiens 


similar to Homo sapiens 
mammalian inositol 
hexakisphosphate kinase 2 
(IP6K2) mRNA with Ge 


2243 


100 


1025 


X69910 


Homo sapiens 


P63 protein 


2958 


99 


1026 


U80736


Homo sapiens 


CAGF9 


1657 


100 


1027 


AB029333 


Halocynthia 
roretzi 


HrPET-1 


1048 


54 


1028 


AB032931 


Homo sapiens 


ubiquitin-conjugating enzyme
isolog 


1045 


100 


1029 


G01797 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5878. 


749 


98 


1030 


G01797 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5878. 


749 


98 


1031 


AF193795 


Homo sapiens 


vacuolar sorting protein 
VPS29/PEP11 


960 


100 


1032 


AJ222968 


Mus musculus


L-periaxin 


120 


30 


1033 


Z81317 


Schizosaccharomyces
pombe


DNA2-NAM7 helicase family
protein 


685 


31 


1034 


Y41519 


Homo sapiens 


Fragment of human secreted
protein encoded by gene 75. 


1321 


99 


1035 


AJ276004 


Mus musculus 


Paxneb protein 


1709 


77 


1036 


AF025459


Caenorhabditis
elegans


H14A12.3 gene product 


190 


30 


1037 


U37251


Homo sapiens 


Description: KRAB zinc finger 
protein; this is a splicing 
supplied by author 


196 


43 


1038 


W74580 


Homo 
sapiens 


Human membrane protein
BA0306.


1921 


97 


1039 


U88173 


Caenorhabditis
elegans


weak similarity to 
Arabidopsis thaliana 
ubiquitin-like protein B


331 


80 


1040 


AF290204 


Homo sapiens 


blood group carrier molecule 
DOK1 


1637 


99 


1041 


Y96730


Homo 
sapiens 


PRO539, a Costal-2 homologue.


162 


22 


1042 


AF140683 


Mus musculus 


F-box protein FWD2 


2397 


98


1043 


AF151023 


Homo sapiens 


HSPC189 


1104 


100 


1044 


AF181631


Drosophila 
melanogaster 


BcDNA.GH04929


204 


37 


1045 


Y77985 


Homo sapiens 


Human collectin amino acid 
sequence.


1940 


100 


1046 


AJ243972 


Homo sapiens 


6-phosphogluconolactonase


1317 


100


1047 


AB035863


Homo sapiens 


ATP specific succinyl CoA 
synthetase beta subunit 
precursor 


2324 


99 


1048 


AL034550


Homo sapiens 


dJ1184F4.2 (novel protein 
similar to nucleolar protein 
4 (NOL4) (NOLP))


981 


92 


1049 


AF163825 


Homo sapiens 


pre-B lymphocyte protein 3 


634 


100 


1050 


AF201949 


Homo sapiens 


60S ribosomal protein L30
isolog 


868 


100 


1051 


AF190624 


Mus musculus 


mdgl-1


236 


85 


1052 


AE003S29 


Drosophila 
melanogaster 


CG6151 gene product 


160 


44 


1053 


G01191 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5272. 


646 


98 


1054 


AL162756


Neisseria 
meningitidis 


Glu-tRNA(Gln)
amidotransferase subunit A


682 


A. A 


1055 


AF131856 


Rattus
norvegicus 


tRNA selenocysteine 
associated protein 


1525 


99 


1056 


U89649 


Chlamydomona 
s 

reinhardtii 


Mr19,000 outer arm dynein
light chain 


244 




1057 


AF159141 


Homo sapiens 


breast cancer metastasis- 
suppressor 1 


663 


53 


1058 


AF230929 


Homo 
sapiens 


keratinocyte annexin-like
protein pemphaxin 


1710 


99


1059


AJ270952 


Homo sapiens 


putative membrane protein 


1363 


100 


1060


AF224263 


Heterodontus 
francisci


HOXD8 


742 


83 


1061 


X63417 


Homo sapiens 


IRLB 


1037 


100 


1062 


AL079345 


Streptomyces 
coelicolor 
A3(2)


hypothetical protein 


143 


27 


1063 


Y71112 


Homo sapiens 


Human Hydrolase protein-10
(HYDRL-10). 


2547 


100 


1064 


AF263614 


Homo sapiens 


acetyl-CoA synthetase


3493 


99 


1065 


Y13356 


Homo sapiens 


Amino acid sequence of 
protein PRO221.


1363 


100 


1066 


AC006153 


Homo sapiens 


similar to Aquifex aeolicus
GTP-binding protein; similar 
to AE000771 (PID:g2984292)


662 


98 


1067 


Y18930 


Sulfolobus 
solfataricus 


hypothetical protein 


162 


29 


1068 


R65969 


Homo 

sapiens T98G 


Glioblastoma-derived
polypeptide. 


887 


100 


1069 


Y07964 


Homo sapiens 


Human secreted protein 
fragment 


863 


96 


1070 


AF177476 


Rattus 
norvegicus 


CDK5 activator-binding 
protein 


1995 


86 


1071 


AF245505 


Homo sapiens 


adlican


3109 


99 


1072 


U92794 


Mus musculus 


alpha glucosidase II, beta 
subunit 


147 


36 


1073 


G03889 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7970.


698 


98 


1074 


U15779 


Homo sapiens 


p70 


380 


28 


1075


Y13392


Homo sapiens


Amino acid sequence of
protein PRO328.


1271


91


1076 


AF161457 


Homo sapiens 


HSPC339 


571 


100 


1077 


Y79509 


Homo sapiens 


Human carbohydrate-associated
protein CRBAP-5. 


2151 


98 


1078 


AF223466 


Homo sapiens 


HT015 protein 


831 


u 


1079 


AL132965 


Arabidopsis 
thaliana


putative WD-40 repeat protein


286 


29 


1080 


AB024937 


Homo sapiens 


LUNX 


1284 


100 


1081 


Y14768 


Homo sapiens 


V-ATPase G-subunit like 
protein 


579 


100 


1082


AF016416 


Caenorhabditis
elegans


F29A7.4 gene product


141 


31 


1083 


L13291 


Homo sapiens 


ADP-ribosylarginine hydrolase 


802 


45 


1084 


AB041541 


Mus musculus


unnamed protein product 


1S1 


44 


1085 


G01922 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6003.


202 


97 


1086 


AB030814 


Homo sapiens 


H-REV107 protein homolog 


833 


100 


1087 


AF151638 


Homo sapiens 


phosphatidylcholine transfer 
protein 


1142


100 


1088 


Y84432 


Homo sapiens 


Amino acid sequence of a 
human RNA-associated
protein. 


2783 


100 


1089 


Y94867 


Homo 
sapiens 


Human protein clone HP10563. 


613 


100 


1090 


AK023982


Homo sapiens 


unnamed protein product 


130 


49 


1091 


AB041586 


Mus musculus 


unnamed protein product 


1103 


81 


1092 


Y71277 


Homo sapiens 


Human Zlipo3 protein.


606 


100 


1093 


U34973 


Mus musculus 


protein tyrosine phosphatase-
like


1131 


95 


1094 


Y66677 


Homo 
sapiens 


Membrane-bound protein
PRO828.


522 


5"6" 


1095 


Y87276 


Homo sapiens 


Human signal peptide 
containing protein HSPP-53 
SEQ ID NO: 53. 


1029 


99 


1096 


Y87276 


Homo sapiens 


Human signal peptide 
containing protein HSPP-53 
SEQ ID NO: 53. 


863 


98 


1097 


AF161455 


Homo sapiens 


HSPC337 


742 


98 


1098 


U80029 


Caenorhabditis
elegans


similar to thioredoxin 


242 


39 


1099 


AJ005866 


Homo sapiens 


Sqv-7-like protein 


1321 


99 


1100 


AJ005866


Homo sapiens 


Sqv-7-like protein 


1118 


99 


1101 


AJ005866 


Homo sapiens 


Sqv-7-like protein 


891 


99 


1102 


AJ005866


Homo sapiens 


Sqv-7-like protein


1016 


99 


1103 


AL110244 


Homo sapiens 


hypothetical protein 


299 


31 


1104 


AF242194 


Drosophila 
melanogaster 


brakeless-B 


147 


52 


1105 


AL031010 


Homo sapiens 


dJ422F24.1 (PUTATIVE novel 
protein similar to C. elegans 
C02C2.5) 


968 


100 


1106 


U28016 


Mus musculus 


parathion hydrolase 
(phosphotriesterase)-related 
protein 


1624 


87 


1107 


AJ278150 


Homo sapiens 


putative lipid kinase 


2207 


99 


1108 


G03733 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7814. 


495 


98 


1109 


AC si X 1 £ O / 


Drosophila 
melanogaster 


G protein RhoBTB 


834 


54 


1110 


Y28921 


Homo 
sapiens 


Human regulatory protein 
HRGP-7. 


941 


48 


1111 


Y28921 


Homo 
sapiens 


Human regulatory protein 
HRGP-7. 


1331 


51 


1112 


AF176704 


Homo sapiens 


F-box protein FBX9 


2027 


99 


1113 


AF182076 


Homo sapiens 


glioma tumor suppressor 
candidate region protein 2 


2418 


100 


1114 


G04039 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8120. 


475 


96 


1115 


AF229439 


Mus musculus 


zinc finger protein 289 


1697 


91 


1116 


L40357 


Homo sapiens 


thyroid receptor interactor 


509 


100 


1117 


L40357 


Homo sapiens 


thyroid receptor interactor 


404 


85 


1118 


A12155 


Homo sapiens 


Human X5L cDNA. 


1673 


100 


1119 


AL161542 


Arabidopsis 
thaliana 


isomerase like protein 


607 


53 


1120 


AL023754 


Homo sapiens 


dJ272L16.1 (Rat 
Ca2+/Calmodulin dependent 
Protein Kinase LIKE protein) 


2341 


98 


1121 


Y57901 


Homo sapiens 


Human transmembrane protein 
HTMPN-25. 


321 


36 


1122 


Z14122 


Xenopus 
laevis 


XLCL2 


455 


77 


1123 


AP225418 


Homo sapiens 


lipase 


1531 


97 


1124 


Y06518 


Homo sapiens 


Zen GTPase interacting 
protein ZIP. 


3227 


100 


1125 


AL035690 


Homo sapiens 


dJ202I21.1 (novel protein) 


952 


100 


1126 


Aa000217 


Homo sapiens 


CLIC2 


1286 


99 


1127 


AB030505 


Mus musculus 


UBE-1c2 


1069 


79 


1128 


Y73375 


Homo sapiens 


HTRM clone 1427838 protein 
sequence. 


874 


100 


1129 


Y78941 


Homo sapiens 


Cyclophilin-type peptidyl 
prolyl cis/trans isomerase 
amino acid sequence. 


877 


100 


1130 


AL023553 


Homo sapiens 


dJ347H13.4 (novel protein) 


557 


100 


1131 


Y91945 


Homo sapiens 


Human chaperone protein 6 
(HCHP-6). 


i / rift 


100 


1132 


Z68197 


Schizosaccharomyces 
pombe 


putative nuclear pore protein 


596 


39 


1133 


Z68197 


Schizosaccharomyces 
pombe 


putative nuclear pore protein 


389 


35 


1134 


AF180681 


Homo sapiens 


guanine nucleotide exchange 
factor 


3597 


100 


1135 


AF079765 


Mus musculus 


enhancer of polycomb 


264 


41 


1136 


M62419 


Mus musculus 


clathrin-associated protein 


2189 


99 


1137 


AJ006219 


Drosophila 
melanogaster 


clathrin-associated protein 


1254 


78 


1138 


Y76218 


Homo sapiens 


Human secreted protein 
encoded by gene 95. 


440 


98 


1139 


W88104 


Homo 
sapiens 


A Rab protein designated 
HRABS-2. 


1065 


99 




1140 


Y13401 


Homo sapiens 


Amino acid sequence of 
protein PRO339. 


3979 


98 




1141 


W85026 


Chimeric - 
Homo sapiens 


Green fluorescent protein- 
Zap70 fusion product. 


3309 


100 




1142 


Y13402 


Homo sapiens 


Amino acid sequence of 
protein PRO310. 


1694 


99 




1143 


G03875 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7956. 


560 


99 




1144 


Y12917 


Homo sapiens 


Amino acid sequence of a 
human secreted peptide. 


750 


98 




1145 


Y12917 


Homo sapiens 


Amino acid sequence of a 
human secreted peptide. 


1094 


100 




1146 


AL022157 


Homo sapiens 


SPIN (SPINDLIN HOMOLOG 
(PROTEIN DXF34)) 


1233 


100 




1147 


AL022157 


Homo sapiens 


SPIN (SPINDLIN HOMOLOG 
(PROTEIN DXF34)) 


1233 


100 




1148 


G02548 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6629. 


370 


93 




1149 


Y73338 


Homo sapiens 


HTRM clone 2019742 protein 
sequence. 


1492 


100 




1150 


W74841 


Homo sapiens 


Human secreted protein 
encoded by gene 113 clone 
HEAAR60 


228 


55 


1151 


AF044201 


Rattus 
norvegicus 


neural membrane protein 35; 
NMP35 


1570 


92 


1152 


AF156774 


Homo 
sapiens 


lysophosphatidic acid 
acyltransferase-gamma 1 


185$ 


99 


1153 


AL118501 


Homo sapiens 


dJ1191N16.1 (A novel protein 
(translation of the cDNA 
DKFZp566A0946, Em:AL050069)) 


872 


64 


1154 


AF131852 


Homo sapiens 


Unknown 


473 


100 


1155 


Y41705 


Homo 
sapiens 


Human PRO352 protein 
sequence. 


1381 


97 


1156 






ID NO: 8117. 


DU / 


99 


1157 


AF112444 


Lupinus 


L-asparaginase 


287 


43 


1158 


AF151848 


Homo sapiens 


CGI-90 protein 




32 


1159 


XtU f 0 ^ u / 


Homo sapiens 


choline dehydrogenase 


2449 


100 


1160 


AB001773 


Ciona 
savignyi 


PEM-6 


196 


33 


1161 


It) 'JjU 


Homo sapiens 


Human signal peptide 
containing protein HSPP-107 
SEQ ID NO: 107. 


746 


83 




JO / JJU 


Homo sapiens 


Human signal peptide 
containing protein HSPP-107 
SEQ ID NO: 107. 


746 


83 


1163 


AF113534 


Homo sapiens 


tifx— air t** procein 


2723 


96 


1164 


AF232226 


Danio rerio 


Deddl 


191 


41 


1165 


AL118501 


Homo sapiens 


dJ1191N16.1 (A novel protein 
(translation of the cDNA 
DKFZp566A0946, Em:AL050069)) 


1051 


71 


1166 


AL118501 


Homo sapiens 


dJ1191N16.1 (A novel protein 
(translation of the cDNA 
DKFZp566A0946, Em:AL050069)) 


945 


76 


1167 


AF187733 


Homo sapiens 


svn^anHi 1 in 

*^ j All— d^JX-X A. JL IX 


831 


42 


1168 


AB019435 


Homo sapiens 


phospholipase 


951 


55 


1169 


AF064604 




ivAu^ pxro t-tsxn 




33 


1170 


Y01164 


Homo sapiens 


Polypeptide fragment encoded 
uy gene o « 


1191 


100 


1171 


L03188 


Saccharomyce 
s cerevisiae 


putative 


180 


22 


1172 


AF113751 


Mus musculus 


nuclear pore membrane 
glycoprotein POM210 


3941 


81 


1173 


AJ245417 


Homo sapiens 


G5b protein 


794 


100 


1174 


AL022238 


Homo sapiens 


dJ1042K10.3 (novel protein) 


1285 


100 


1175 


U41278 


Caenorhabdit 
is elegans 




332 


T5 


1176 


M35617 


Homo sapiens 


* — X CUCjVbUXi V — CXX UUCl" U 

alpha region 


284 


Q "J 


1177 


AC012680 


Arabidopsis 
thaliana 


putative protein phosphatase 
2C; 55455-56414 


209 


37 


1178 


G01345 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5426. 


692 


99 


1179 


AL09*767 


Homo sapiens 


dJ579N16.3 (novel protein 
similar to worm, Arabidopsis 
and pine proteins) 


1342 


100 


1180 


AF039716 


Caenorhabdit 
is elegans 


similar t*r» ATP flvnthacp Ti 
chain 


496 


-~ — ' 


1181 


Yll^lO 


Homo sapiens 


collagen type XIV 


1048 


97 


1182 


X82240 


Homo 
sapiens) 
>R94974 
R94974 09- 
MAY-1996 27- 
OCT-1994 
Human TCL-1 
polypeptide. 


T cell leukemia /lymphoma 1 


617 


100 


TABLE 2 



SEQ 
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


% 
IDENTITY 








[Homo 
sapiens 








1183 


U42841 


Caenorhabdit 
is elegans 


short region of weak 
similarity to collagen 


161 


33 


1185 


AJ131613 


Homo sapiens 


dicarboxylate carrier protein 


1470 


99 


1186 


L27645 


Danio rerio 


growth-associated protein 


130 


36 


1187 


Y02738 


Homo sapiens 


Human secreted protein 
encoded by gene 89 clone 
HLHFP03. 


636 


100 


1188 


AF217544 


Xenopus 
laevis 


ornithine decarboxylase-2 


1459 




1189 


AL136307 


Homo sapiens 


dJ380B8.2 (Neuritin, a 
protein which promotes 
neurite outgrowth) 




33 


1190 


X89602 


Homo sapiens 


rTSbeta 


197 


100 


1191 


U32828 


Haemophilus 
influenzae 
Rd 


ribosomal protein S6 
modification protein (rimK) 


268 


31 


1192 


AF154831 


Rattus 
norvegicus 


PV-1 






1193 


Y50926 


Homo sapiens 


Human fetal brain cDNA clone 
v ^ - 1 - ° J- u.t3ij,vtswi procein . 


918 


100 


1194 


AF026530 


Rattus 
norvegicus 


variant RB3 • 1 


i n q *j 


97 


1195 


U35244 


Rattus 
norvegicus 


vacuolar protein sorting 


2981 


96 


1196 


Y70470 


Homo sapiens 


Human havnAK- mrtl ^r*n\~t a 
ii Ltilldii L- OL ly <i L. luOleCUlCf 

PRG3 protein. 


IboU 


100 


1197 


AF157318 


Homo sapiens 


AD-017 protein 




4 / 


1198 


AF125443 


Caenorhabdit 
is elegans 


contains similarity to S. 
(GB:Z28295) 


460 


39 


1199 


AF201934 


Homo sapiens 


DC12 


1649 


88 


1200 


AL031775 


Homo sapiens 


dJ30M3.3 (novel protein 
similar to C. elegans 
Y63D3A.4) 


1902 


100 


1201 


M21103 


Ovis aries 


BIIIB4 high-sulfur keratin 


484 


82 


1202 


Z85986 


Homo sapiens 


dJ108K11.3 (similar to yeast 


1143 


75 


1203 


U18762 


Rattus 
norvegicus 


retinol dehydrogenase type I 


890 


52 


1204 


U35730 


Mus musculus 


jerky 


2235 


it 


1205 


AB002327 


Homo sapiens 


KIAA0329 


151 


24 


1206 


AB019233 


Arabidopsis 
thaliana 


ubiquinone/menaquinone 
biosynthesis 
methyltransferase-like 


762 


56 


1207 


AL136307 


Homo sapiens 


dJ380B8.2 (Neuritin, a 
protein which promotes 
neurite outgrowth) 


742 


100 


1208 


AF207989 


Homo sapiens 


orphan G-protein coupled 
receptor 


2326 


100 


1209 


Z97630 


Homo sapiens 


dJ466N1.4 (novel protein 
similar to ANK3 (ankyrin 3, 
node of Ranvier (ankyrin 
G))) 


181 


44 


1210 


U21549 


Mus musculus 


Ac39/physophilin 


1280 


68 


1211 


Y27700 


Homo sapiens 


Human secreted protein 
encoded by gene No. 12. 


1267 


100 


1212 


AF117814 


Mus musculus 


odd-skipped related 1 protein 


945 


66 


1213 


AF277233 


Naegleria 
fowleri 


calcineurin B 


222 


39 


1214 


014849 


Mus musculus 


meiosis-specific nuclear 
structural protein 1 


1950 


77 


1215 


G03022 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7103. 


590 


100 


1216 


Z72510 


Caenorhabdit 
is elegans 


similarity to yeast UTR3 
protein (Swiss Prot accession 
yk677h11.5 comes from this 
gene 


634 


49 


1217 


Z49703 


Saccharomyce 
s cerevisiae 


unknown 


134 


22 


1218 


AC013430 


Arabidopsis 
thaliana 


F3F9.18 


199 


29 


1219 


L10910 


Homo sapiens 


splicing factor 


1026 


71 


1220 


Z70750 


Caenorhabdit 
is elegans 


similar to vanadate 
resistance protein 
transmembranous comes from 
this gene 


965 


SB 


1221 


AL163815 


Arabidopsis 
thaliana 


putative protein 


653 


61 


1222 


AF155100 


Homo sapiens 


zinc finger protein NY-REN-21 
antigen 


2261 


100 


1223 


J05071 


Bos taurus 


GTP-binding regulatory 
protein gamma-6 subunit 


356 


100 


1224 


Y73364 


Homo sapiens 


HTRM clone 2765991 protein 
sequence. 


1169 


99 


1225 


AL050170 


Homo sapiens 


hypothetical protein 


714 


100 


1226 


X64002 


Homo sapiens 


RAP74 


2661 


99 


1227 


X04085 


Homo sapiens 


catalase 


234G 


100 


1228 


AJ005620 


Mus musculus 


skeletal muscle-specific gene 


1416 


90 


1229 


AF045564 


Rattus 
norvegicus 


development-related protein 


1715 


93 


1230 


X97571 


Mus musculus 


HCMV-interacting protein 


479 


96 


1231 


L08239 


Homo sapiens 


located at OATL1 


2274 


100 


1232 


AF121863 


Homo sapiens 


sorting nexin 14 


1964 


100 


1233 


AF121863 




sorting nexin 14 


1203 


84 


1234 


AC024805 


Caenorhabdit 
is elegans 


contains similarity to 
TR:O04595 




31 


1235 


AC006634 


Caenorhabdit 
is elegans 


contains similarity to 
Saccharomyces cerevisiae 
probable membrane protein 
YLR418C (GB:U20162) 


357 


33 


1236 


Y18101 


Mus musculus 


macrophage actin-associated- 
tyrosine-phosphorylated 
protein 


1559 


87 


1237 


AB042646 


Homo sapiens 


TGIF2 


1224 


100 


1238 


AB026264 


Homo sapiens 


IMPACT 


1694 


100 


1239 


AB026264 


Homo sapiens 


IMPACT 


1123 


100 


1240 


G00429 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 4510. 


324 


100 


1241 


Y76144 


Homo sapiens 


Human secreted protein 
encoded by gene 21. 


1363 


53 


1242 


AL035602 


Arabidopsis 
thaliana 


putative protein 


499 


28 


1243 


X76483 


Gallus 
gallus 


Yes-associated protein 
(65kDa) 


574 


48 


1244 


AF220186 


Homo sapiens 


uncharacterized hypothalamus 
protein HT012 


503 


100 


1245 


AL021453 


Homo sapiens 


dJ821D11.3 (PUTATIVE protein) 


856 


100 


1246 


AJ276003 


Homo sapiens 


GAR1 protein 


1216 


100 


1247 


Y57910 


Homo sapiens 


Human transmembrane protein 
HTMPN-34. 


1369 


98 


1248 


AC004874 


Homo sapiens 


similar to N- 
acetylgalactosaminyltransferase; 
similar to Q07537 
(PID:g1171989) 


957 


100 


1249 


AF199597 


Homo 
sapiens 


A-type potassium channel 
modulatory protein 1 


1139 


100 


1250 


Y13148 


Rattus 
norvegicus 


PAG608 


1350 


88 


1251 


M24852 


Rattus 
norvegicus 


neuron-specific protein PEP-19 


124 


46 


TABLE 2 



PCT/US00/34263 



SEQ 
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


% 

IDENTITY 


1252 


AF146738 


Rattus 
norvegicus 


testis specific protein 


771 


83 


1253 


G02725 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6806. 


419 


97 


1254 


W44375 


Homo sapiens 


Human ubiquitin-conjugating 
enzyme polypeptide. 


1045 


99 


1255 


AC006538 


Homo sapiens 


BC41195_1 


831 


78 


1256 


AB004316 


Bos taurus 


mitochondrial methionyl-tRNA 
transformylase 


1556 


88 


1257 


Z35094 


Homo sapiens 


SURF-2 


1354 


97 


1258 


Y13362 


Homo sapiens 


Amino acid sequence of 
protein PRO214. 


2383 


100 


1259 


AC006014 


Homo sapiens 


similar to RFP transforming 
protein; similar to P14373 
(PID:g132517) 


1299 


100 


1260 


AC005099 


Homo sapiens 


match to AI222572 
(NID:g3804775) 


469 


100 


1261 


V00507 


Homo sapiens 


coding sequence of DHFR (1 is 
1st base in codon) (561 is 
3rd base in codon) 


984 


100 


1262 


X15443 


Rattus sp. 


gamma-glutamyltranspeptidase 
(AA 1-568) 


697 


32 


1263 


AF173871 


Mus musculus 


neuronal PAS 3 


977 


94 


1264 


AF178983 


Homo sapiens 


Ras-associated protein Rap1 


433 


97 


1265 


Y70473 


Homo sapiens 


Human cyclic nucleotide- 
associated protein-1 
(CNAP-1). 


2785 


99 


1266 


Y41738 


Homo 
sapiens 


Human PRO541 protein 
sequence. 


1622 


100 


1267 


AF061346 


Mus musculus 


Edpl protein 


1077 


64 


1268 


U97006 


Caenorhabdit 
is elegans 


C13F10.4 gene product 


154 


23 


1269 


AF233582 


Mus musculus 


GTPase Kab3v 


942 


95 


1270 


AF195951 


Homo sapiens 


signal recognition particle 
68 


3127 


98 


1271 


AL031177 


Homo sapiens 


dJ889M15.3 (novel protein) 


1150 


55 


1272 


AF201933 


Homo sapiens 


DC11 


650 


100 


1273 


AF201933 


Homo sapiens 


DC11 


346 


98 


1274 


AL021710 


Arabidopsis 
thaliana 


putative protein 


348 


49 


1275 


AC004449 


Homo sapiens 


R33663 3 


556 


100 


1276 


Y86295 


Homo sapiens 


Human secreted protein 
HL2AGB7, SEQ ID NO: 210. 


1920 


100 


1277 


Y71111 




Human hydrolase protein-9 
(HYDRL-9). 


1576 


99 


1278 


S94421 


Homo sapiens 


T cell receptor eta-exon 


478 


100 


1279 


Y66695 


Homo 
sapiens 


PRO1344. 


1909 


100 


1280 


AF161380 


Homo sapiens 


HSPC262 


772 


100 


1281 


Y48610 


Homo sapiens 


Human breast tumour- 
associated protein 71. 


779 


100 


1282 
1283 


AC015446 
AK024432 


Arabidopsis 
thaliana 
Homo sapiens 


Similar to Aid protein 
FLJ00022 protein 


406 


35 


1284 
1285 


WS^1S3 
AJ001019 


Homo sapiens 


Human FADD-interacting 
protein (FIP). 
ring finger protein 


403 
1825 


35 
81 


1286 


AE003823 


Drosophila 
melanogaster 


CG13178 gene product 


1301 
195 


100 
29 


1287 


AF178632 


Homo sapiens 


FEM-l-like death receptor 
binding protein 


3261 


100 


1288 


AC006033 


Homo 
sapiens 


similar to MLN 64; similar to 
138027 (PID:g2135214) 


1195 


100 


1289 


AC006033 


Homo 
sapiens 


similar to MLN 64; similar to 
138027 (PID:g2135214) 


668 


93 


1290 


AB023811 


Homo sapiens 


rU3A 


351 


54 


1291 


Z73424 


Caenorhabdit 
is elegans 


C44B9.1 


235 


36 


1292 


Y94871 


Homo 
sapiens 


Human protein clone HP02551. 


1222 


100 


1293 


AF130425 


Homo sapiens 


retinoblastoma-associated 
protein RAP140 


489 


29 


1294 


G03856 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7937. 


538 


99 


1295 


AF133670 


Mus musculus 


ARL-6 interacting protein-2 


367 


51 


1296 


AJ249735 


Homo sapiens 


claudin-6 


1142 


100 


1297 


X57560 


Escherichia 
coli 


pspE protein 


53* 


t f\f\ i 


1298 


AF169284 


Homo sapiens 


LIM and cysteine-rich domains 
protein 1 


1997 


100 


1299 


U41023 


Caenorhabdit 
is elegans 


coded for by C. elegans cDNA 
yk61f1.3; coded for by C. 
yk109h8.5 


324 


29 


1300 


AB024523 


Homo sapiens 


basic kruppel like factor 


1206 




1301 


X55989 


Homo sapiens 


eosinophil cationic-related 
protein 


737 


99 


1302 


AF007151 


Homo sapiens 


unknown 


1481 


100 


1303 


X52904 


Escherichia 
coli 






100 


1304 


U19577 


Escherichia 
coli 


galactonate dehydratase 


242 


93 


1305 


AF266508 


Mus musculus 


NEIiF protein 


1409 


97 


1306 


Y57901 


Homo sapiens 


Human transmembrane protein 
HTMPN-25. 


932 


100 


1307 


U58750 


Caenorhabdit 
is elegans 


similar to the mitochondrial 
carrier family 


365 


54 


1308 


AF044774 


Homo sapiens 


breakpoint cluster region 
protein 2 


2681 


99 


1309 


AL078593 


Homo sapiens 


dJ210B1.1 (KIAA0680) 


267 




1310 


X82693 


Homo sapiens 


E48 antigen 


620 


96 


1311 


Z82263 


Caenorhabdit 
is elegans 


C47A4.1 


^ a * 


35 


1312 


AF131218 


Homo sapiens 


chromosome 16 open reading 
frame 5 




100 


1313 


Y41763 


Homo 
sapiens 


Human PRO938 protein 
sequence. 


1636 


100 


1314 


AF196972 


Homo sapiens 


JM24 protein 


2239 


100 


1315 


AF053356 


Homo sapiens 


insulin receptor substrate 
like protein 


228 


97 


1316 


Y66695 


Homo 
sapiens 


Membrane-bound protein 
PRO1344. 


1909 


100 


1317 


AF153127 


Gallus 
gallus 


SAPK interacting protein 


2442 


89 


1318 


AF153127 


Gallus 
gallus 


SAPK interacting protein 


1477 


83 


1319 


AF153127 


Gallus 
gallus 


SAPK interacting protein 


1651 


86 


1320 


X56932 


Homo sapiens 


23 kD highly basic protein 


1044 


100 


1321 


AF174605 


Homo 
sapiens] 
>Y83086 
Y83086 09- 
MAR-2000 28- 
AUG-1998 F- 
box protein 
FBP-18. 
[Homo 
sapiens] 


F-box protein Fbx25 


467 


70 


1322 


M61732 


Trypanosoma 
cruzi 


neuraminidase 


214 


24 


1323 


Y17013 


porcine 
endogenous 
retrovirus 


pol 


304 


64 


1324 


nXiX J OOOL) 


Arabidopsis 


putative protein 


1174 


37 


1325 


AL138655 


Arabidopsis 
thaliana 


putative protein 


946 


35 


1326 


rUi<LJ J6lO 


Homo sapiens 


Villi AQT.7 O f n^iml nynhBi' t-i 

Dn X\J o Xi / . c* \ novel procciri 

similar to rat tricarboxylate 

—j y y »? 1 
V«- CX LilCi / 


1 "JOT 


99 


1327 


x v j, w> *i x 




XiCJ r LUOD 


1357 


99 


1328 


Y73346 


Homo sapiens 


HTRM clone 619699 protein 
sequence . 


785 


96 


1329 


Xtxu y xu 


Homo sapiens 


splicing factor 


912 


82 


1330 


AF146568 


Homo sapiens 


MIL1 protein 


1936 


100 


1331 


W87772 


Homo sapiens 


Human serum glucocorticoid- 
regulated kinase (H-SGK2) 
polypeptide. 


232 


39 


1332 


Y41741 


Homo 
sapiens 


Human PRO704 protein 
sequence. 


1860 


100 


1333 


AF295096 


Homo sapiens 


zinc-finger protein ZBRK1 


411 


91 


1334 


Z82271 


Caenorhabdit 
is elegans 


Similarity to Mouse kinesin- 
like protein KIF4 comes from 
this gene 


578 


44 


1335 


AE000810 


Methanobacterium 
thermoautotrophicum 


conserved protein 


290 


43 




\fC 0770 


Homo sapiens 


Amino acid sequence of a 
human phosphorylation 
effector PHSP-11. 


1019 


91 


1337 


/UUJ 


Mus musculus 


protein phosphatase 


378 


84 


1338 


U64856 


Caenorhabdit 
is elegans 


weak similarity to TPR 
domains 


215 


40 


1339 


AE001394 


Plasmodium 
falciparum 


protein of the YMR7 family 


170 


29 


1340 


X76717 


Homo sapiens 


MT-11 protein 


204 


89 


1341 


AC011914 


Arabidopsis 
thaliana 


putative mutT protein; 68398- 
67881 


289 


45 


1342 


AJ276171 


Homo sapiens 


ASPIC 


2122 


100 


1343 


AF187016 


Homo sapiens 


myosin regulatory light chain 
interacting protein MIR 


2303 


99 


1344 


ArnngQg^ 


Homo sapiens 


similar to Kelch proteins; 
similar to BAA77027 
(PID:g4650844) 


894 


35 


1345 


J-LT A3 / %DQ 


Homo sapiens 


N-acetylneuraminic acid 
phosphate synthase 


1880 


99 


1346 


Y25896 


Homo sapiens 


Human secreted protein 
fragment encoded from gene 
64. 


1148 


100 


1347 


AJ272073 


Torpedo 
marmorata 


male sterility protein 2-like 


1 CCA 
X O O f£ 


58 


1348 


AF161548 


Homo sapiens 


HSPC063 


1018 


98 


1349 


W78128 




Human secreted protein 
HOSBI96. 


1117 


100 


1351 


G02144 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6225. 


418 


100 


1352 


D90869 


Escherichia 
coli 


similar to 


2047 


100 


1353 


A12029 


Homo sapiens 


MRP-14 


613 


100 


1354 


AC005328 


Homo sapiens 


R26660_1, partial CDS 


870 


74 


1355 


AC024876 


Caenorhabdit 
is elegans 


contains similarity to 
SW:RPB1_CRIGR 


829 


61 


1356 


AF077226 


Homo sapiens 


copine III 


1876 


64 


1359 


AF217188 


Mus musculus 


YIP1B 


801 


63 


1360 


AC074331 


Homo sapiens 


ZNF234 


3869 


100 


1361 


AL163279 


Homo sapiens 


homolog to cAMP response 
element binding and beta 
transducin family proteins 


5035 


99 


1362 


Z48475 


Homo sapiens 


glucokinase regulator 


3160 


99 


1363 


Z48475 


Homo sapiens 


glucokinase regulator 


2682 




1364 


AF195764 


Homo sapiens 


megakaryocyte-enhanced gene 
transcript 1 protein; MEGT1 
protein 


2055 




1365 


AF116609 


Homo sapiens 


PRO0915 


581 


100 


1366 


AF116609 


Homo sapiens 


PRO0915 


581 


100 


1367 




Homo sapiens 


aoo /ooiu . j tnovex procem 
similar to C. elegans 

J.17D1U . D \ 1 r . VJZZ 33 / J ) 


2581 


99 


1368 


Y34124 


Homo 
sapiens 


Human potassium channel 
K+Hnov15. 


1l/t 9 


100 


1369 


AJ245521 


Homo sapiens 


protein 


3728 


99 


1370 


AF008220 


Bacillus 
subtilis 


X Cava 


429 


45 


1371 


X05562 




oxpfia-z cnoin precursor iaa - 

LU IU10 I VJIlD XS cUU DaSc 


5908 


99 


1372 


Z98048 


Homo sapiens 


dJ408N23.4 (novel DnaJ domain 


1296 


99 


1373 


AF154415 


Homo sapiens 


FLASH 


10253 


100 


1374 


U20286 


Rattus 
norvegicus 


lamina associated polypeptide 
1C 


Tug-. 

lDQ / 


— — 

& y 


"13 75 


U^344S 


Homo sapiens 


DOC1 


1645 


46 


1376 


AL117337 


Homo 
sapiens 


37«?u xo . x \sinc xxnger 

T3T"ot- pin 33a ffeny \ \ 
i- uiciji j j ci \ a 3 x j j 


OCA 

£. J U 


ou 


1377 


AC005328 


Homo sapiens 


R26660_1, partial CDS 


1126 


100 


1378 


U35113 


Homo sapiens 


metastasis-associated gene 




69 


1379 


L15313 


Caenorhabdit 
is elegans 


£->U tat lvo 


oco 


58 


1380 


Y25756 


Homo sapiens 


Human secreted protein 
encoded from gene 46. 


1508 


100 


1381 


AB037360 


Homo sapiens 


ANKHZN 


5734 


95 


1382 


AB037360 


Homo sapiens 


ANKHZN 


959 


97 


1383 


AF237676 


Mus musculus 




1721 


96 


1384 


AF237676 


Mus musculus 


G beta-like protein GBL 


1043 


70 


1385 


Y58793 


Homo sapiens 


Human calcium regulatory 
protein CaREG-1. 


/lb 


100 


1386 


AF212162 


Homo sapiens 


ninein 


10369 


' OQ 


1387 


AL031685 


Homo sapiens 


dJ963K23.2 (novel protein) 


337 


33 


1388 


AC004890 


Homo sapiens 


similar to zinc finger 
proteins; similar to BAA24380 
>W06316 W06316 03-OCT-1996 
27-APR-1995 TRP-1 protein. 


542 


86 


1389 


AF187989 


Homo sapiens 


zinc finger protein ZNF223 


2665 


99 


1390 


AC035150 


Homo sapiens 


Zinc finger protein ZNF221 


3459 


100 


1391 


AF287894 


Homo sapiens 


PIST 


1410 


97 


1392 


AF282265 


Homo sapiens 


Inner centromere protein 
INCENP 


1794 


99 


1393 


X90840 


Homo sapiens 


axonal transporter of 
synaptic vesicles 


4584 


99 


1394 


AF076249 


Homo sapiens 


zinc f inaer nrotein SRHT21 i 


3208 


99 


1395 


G02224 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6305. 


299 


75 


1396 


AC004809 


Arabidopsis 
thaliana 


Similar to 


130 


34 


1398 


AF242519 


Homo sapiens 


zinc finger protein SBZF3 


181 


66 


1399 


AL133396 


Homo 
sapiens 


dJ1068H6.4 (prion protein 
like protein doppel) 


962 


100 


1400 


Y48611 


Homo sapiens 


Human breast tumour- 
associated protein 72. 


817 


99 


1401 


AC004472 


Homo sapiens 


PI. 11659 5 


280 


54 


1402 


X91489 


Saccharomyce 
s cerevisiae 


putative HMG box 


164 


27 


1403 


Y79222 


Homo 
sapiens 


Human transferase TRNSFS-14. 


2842 


100 


1404 


X81058 


Mus musculus 


tex261 


1010 


99 


1405 


AB012084 


Mus musculus 


ITM 


194 


29 


1406 


AB030251 


Homo sapiens 


GTPase activating protein 


3233 


99 


1407 


AJ010585 


Rattus 
rattus 


PTB-like protein 


2684 


99 


1408 


X75760 


Drosophila 
melanogaster 


LRR47 


364 


29 


1409 


U76618 


Mus musculus 


N-RAP 


804 


48 


1410 


AC005578 


Homo sapiens 


F20887_1, partial CDS 


835 


63 


1411 


AE000284 


Escherichia 
coli 


orf, hypothetical protein. 


360 


100 


1412 


X01563 


Escherichia 
coli 


L5 (rplE) (aa 1-179) 


911 


100 


1413 


W78279 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 33. 


1264 


99 


1414 


AB031051 


Homo sapiens 


organic anion transporter 
OATP-E 


3832 


100 


1415 


M17466 


Homo sapiens 


coagulation factor XII 


3455 


100 


1416 


AF097994 


Homo 
sapiens 


L-kynurenine/alpha- 
aminoadipate aminotransferase 


2202 


99 


1417 


AF151077 


Homo sapiens 


HSPC243 


1262 


99 


1418 


Y09945 


Rattus 
norvegicus 


putative integral membrane 
transport protein 


1098 


61 


1419 


U13152 


auratus 


protein beta 5 


4.x fit 


76 


1420 


AL162458 


Homo sapiens 


bA465L10.5 (KIAA1176 (novel 
protein, presumed ortholog 
of mouse K-Cl cotransporter 
KCC2)) 


5696 


100 


1421 


Y99426 


Homo sapiens 


Human PRO1604 (UNQ785) amino 
acid sequence SEQ ID NO: 308. 


152 


29 


1422 


Y94923 


Homo sapiens 


Human secreted protein clone 
qs14_3 protein sequence SEQ 
ID NO: 52. 


4039 


99 


1423 


AF177388 


Homo 
sapiens 


cancer-amplified 
transcriptional coactivator 
ASC-2 


10748 


99 


1424 


Y48S17 


Homo sapiens 


Human breast tumour- 
associated protein 62. 


1851 


99 


1425 


AF208848 


Homo sapiens 


BM-006 


1454 


89 


1426 


AF208848 


Homo sapiens 


BM-006 


853 


79 


1427 


AF112886 


Bos taurus 


differentiation enhancing 
factor 1 


4693 


95 


1428 


U41387 


Homo sapiens 


Gu protein 


1372 


63 


1429 


AF161534 


Homo sapiens 


HSPC049 


2853 


78 


1430 


AF125043 


Mus musculus 


bisphosphate 3'-nucleotidase 


275 


30 


1431 


Y66718 


Homo 
sapiens 


Membrane-bound protein 
PRO1106. 


1886 


100 


1432 


AF193613 


Homo sapiens 


cell recognition molecule 
Caspr2 


568 


100 


1433 


AB044560 


Mus musculus 


Gliacolin 


192 


34 


1434 


R99900 


Homo sapiens 


NTII-1 nerve protein, 
facilitates regeneration of 
nerve cells. 


707 


51 


1435 


AF220530 


Homo sapiens 


myo-inositol 1-phosphate 
synthase A1 


2904 


100 


1436 


X70944 


Homo sapiens 


PTB-associated splicing 
factor 


1261 


72 


1437 


AF271732 


Homo sapiens 


bridging integrator-3 


1282 


100 


1438 


Y30811 


Homo sapiens 


Human secreted protein 
encoded from gene 1. 


595 


98 


1439 


AO"293659 


Homo sapiens 


mucolipidin 


628 


97 


1440 


AF219138 


Homo sapiens 


GGA3 long isoform 


3083 


100 


1441 


AF219138 


Homo sapiens 


GGA3 long isoform 


3346 


100 


1442 


AB039669 


Homo sapiens 


ALEX3 


1944 


100 


1443 


AF237711 


Drosophila 
melanogaster 


Diablo 


191 


27 


1444 


AJ011896 


Homo sapiens 


Nafl beta protein 


439 


39 


1445 


X73874 


Homo sapiens 


phosphorylase kinase 


6233 


9d 


1446 


AF214114 


Homo sapiens 


breast carcinoma-associated 
antigen BCAA 


3999 


99 


1447 


AF003924 


Homo sapiens 


ANC 2H01 


2645 


99 


1448 


AF003136 


Caenorhabdit 
is elegans 


contains weak similarity to 
an AMP-binding motif 


2843 


52 


1449 


AF155112 


Homo sapiens 


NY-REN-50 antigen 


1184 


89 


1450 


Y95004 


Homo sapiens 


Human secreted protein 
vc54_1, SEQ ID NO: 48. 


985 


100 


1451 


AF107203 


Homo sapiens 


ataxin 2-binding protein 


688 


57 


1452 


AF107203 


Homo sapiens 


ataxin 2-binding protein 


'SO 


/ o 


1453 


Z38011 


Mus musculus 


DMR-N9 


882 


56 


1454 


X90568 


Homo sapiens 


annotation available soon via 
LABEIT@EMBL-Heidelberg.DE




28 


1455 


AL035409 


Homo sapiens 


dJ564M11.3 (similar to
sialyltransferase)


1356 


100 


1456 


D44480 


Mus musculus 


MATH - 0 rvi-fkh t>i n 


4* / « 


100 


1458 


AF141326 


Homo sapiens 


RNA helicase HDB/DICE1 


478 


45 


1459 


AF242552 


Gallus 


retinovin 


945 


34 


1460 


U11036 


Homo sapiens 


Ibdl 


724 


84


1461 


AB025258


Mus musculus 


granuphilin-a 


545 


39 


1462 


Y08134 


Homo sapiens 


acid sphingomyelinase-like
phosphodiesterase


2428 


99 


1463 


AC004997 


Homo sapiens 


match to ESTs Z43979
(NID:g573097), R19699
(NID:g774333)


869 


98 


1464 


AC004997 


Homo sapiens 


match to ESTs Z43979
(NID:g573097), R19699
(NID:g774333)


869 


98 


1465


U32743 


Haemophilus 
influenzae
Rd 


fucose operon protein (fucU) 


315 


50 


1466 


V69622 


Homo sapiens 


Not56-like protein 


2342 


100 


1467 


AC003034 




homolog of rat Kidney-
specific (KS) gene


1072 


99 


1468 


AF071544 


Spinacia 
oleracea


ribulose-1,5-bisphosphate
subunit N-methyltransferase I


333 


26 


1469


Y57930 


Homo sapiens 


Human transmembrane protein 
HTMPN-54. 


1053


100


1470 


AF032666 


Rattus 
norvegicus 


rsec5 


4504 




1471 


Y70467 


Homo sapiens 


Human membrane channel 
protein-17 (MECHP-17).


452 


74 


1472 


AL031033 


Homo sapiens 


C321D2.1 (Ribosomal Large
Subunit Pseudouridine 
Synthase protein) 


1694 


100 


1473 


AF177292 


Homo sapiens 


genethonin 3 


4026 


98 


1474 


S45936 


Homo sapiens 


HTS1 


1101 


ou 


1475 


Y86241 


Homo sapiens 


Human secreted protein 
HOABR60, SEQ ID NO: 156. 


1879 


98 


1476 


AJ010317 


Fugu 
rubripes 


Sand 


1278 


68 


1477 


U42831 


Caenorhabditis
elegans


coded for by C. elegans cDNA
yk99b4.3; similar to human
transforming protein 
(PIR:S22157) 


846 


44 


1478 


X62447 


Homo sapiens 


PR 264 


543 


61 


1479 


X82209 


Homo sapiens 


MN1 


7116 


100 


1480 


uiubJb | Pan paniscus 


MHC class I A


675 


84


1481 


AL078599 


Homo sapiens 


dJ9SlC6.l (novel protein 
similar to C. elegans 
F55A12.9 (Tr:P91086)) 


1274


65 


1482


Z98977 


Schizosaccharomyces
pombe


putative vacuolar protein 


256 


29 


1483 


AB005662


Mus musculus 


JNK/SAPK-associated protein-1


4968 


92 


1484 


AL050120 


Homo sapiens 


hypothetical protein 


716 


100 


1485 


M27878 


Homo sapiens 


DNA binding protein 


1006 


53 


1486 


Y69161 


Homo sapiens 


Amino acid sequence of a 
partial protein kinase. 


575 


99 


1487 


X84156 


Saccharomyces
cerevisiae


ATH1 


341 


29 


1488 


AF038963


Homo sapiens 


RNA helicase 


446 


34 


1489 


U56966 


Caenorhabditis
elegans


coded for by C. elegans cDNA 
yk30b3.5; coded for by C.
elegans cDNA yk30b3.3


620 


42 


1490 


AE000989


Archaeoglobus
fulgidus


enoyl-CoA hydratase (fad-4)


533 


A C 

** o 


1491 


M80633 


Rattus 
norvegicus 


adenylyl cyclase type IV 


707 


95 


1492 


Y73342 


Homo sapiens 


HTRM clone 2709055 protein 
sequence.


3513 


99 


1493 


Y17220 


Homo sapiens 


Human secreted protein (clone 
fj283-ll). 


462 


O I 


1494 


AF133670 


Mus musculus 


ARL-6 interacting protein-2 


701 


97


1495


Y94897 


Homo 
sapiens 


Human protein clone HP10574. 


13^71 


i nn 


1496 


AL049699 


Homo sapiens 


dJ747H23.2 (novel protein) 


1550 


100


1497 


AF037447 


Homo sapiens 


ribosomal S6 protein kinase 


2427 


100 


1498 


AL445067 


Thermoplasma 
acidophilum 


putative target YPL207w of 
the HAP 2 transcriptional 
complex related protein 


269 


35 


1499


AB039947 


Homo sapiens 


X11L-binding protein 51


227 


36


1500 


AJ277750 


Homo sapiens 


UBASH3A protein 


3509 


100 


1501 


AL050333


Homo 
sapiens 


dJ93K22.1 (novel protein 
(contains DKFZP564B116) ) 


2439 


100 


1502 


AF179895 


Homo sapiens 


TALE homeobox protein Meis2b


1140 


100 


1503 


AF178948 


Homo sapiens 


TALE homeobox protein Meis2a 


1177 


100 


1504 


Y53005 


Homo sapiens 


Human secreted protein clone 
pm749_8 protein sequence SEQ
ID NO:16. 


1442 


99 


1505 


X82494 


Homo sapiens 


fibulin-2


3580 


99 


1506 


X98296 


Homo sapiens 


ubiquitin hydrolase 


783 


42 


1507 


AL034548 


Homo sapiens 


dJ1103G7.6 (novel protein) 


1098 


100 


1508 


Y76144 


Homo sapiens 


Human secreted protein 
encoded by gene 21. 


1736 


100 


1509 


AF220182


Homo sapiens 


uncharacterized hypothalamus 
protein HT008


1181 


98 


1510


U64601


Caenorhabditis
elegans


Gene probably begins in the 
next cosmid 


415 


58 


1511 


ALB 56192 


Neurospora 
crassa 


related to MDM1 protein 


196


29 


1512 


D17629 


Homo 
sapiens 


N-acetylgalactosamine 6- 
sulfate sulfatase (GALNS) 


1829 


100 


1513 


AF168717 


Homo sapiens 


x 009 protein 


694 


99 


1514 


AJ243531 


Homo sapiens 


nM15 protein 


735 


100 


1515 


AC003672


Arabidopsis 
thaliana


putative C3HC4-type RING zinc 
finger protein 


407 


30 


1516 


AF115435 


Rattus 
norvegicus 


syntaxin 17


1374 


90


1517 


AF003140 


Caenorhabditis
elegans


C44E4.5 gene product 


274 


31 


1518 


AB002584 


Rattus 
norvegicus 


beta-alanine-pyruvate
aminotransferase 


2238 


82 


1519 


AL121764


Schizosaccharomyces
pombe


yeast atp12 protein precursor
homolog


270


30


1520 


AF255910 


Homo 
sapiens 


vascular endothelial 
junction-associated molecule 


547 


100 


1521


D31764 


Homo sapiens 


KIAA0064 


170


27 


1522 


Y66634 


Homo 
sapiens 


Membrane-bound protein
PRO190.


985 


100


1523


""794450 


Homo sapiens 


Human inflammation associated 
protein 


250 


43 


1524 


AC000107


Arabidopsis 
thaliana 


F17F8.22 


277 


37 


1525 


AF109377 


Mus musculus 


ldlBp 


1277 


83 


1526 


AL031427 


Homo sapiens 


dJ167A19.4 (novel protein) 


1432 


99 


1527 


Y08135 


Mus musculus 


acid sphingomyelinase-like
phosphodiesterase


1496 


79 


1528 


AK024423 


Homo sapiens 


FLJ00012 protein 


£11 


100 


1529 


AF154502 


Homo sapiens 


quiescent cell proline 
dipeptidase


679 


100 


1530 


AF205598 


Homo sapiens 


transposase-like protein 


1368 


100 


1531 


AF251039 


Homo sapiens 


putative zinc finger protein 


1420 


50 


1532 


W74805 


Homo sapiens 


Human secreted protein 
encoded by gene 77 clone 
HOEAS24.


493 


57 


1533 


AF039023 


Homo sapiens 


Ran-GTP binding protein; 

nuiioro 


5707 


99 


1534 


AC007190 


Arabidopsis 
thaliana 


F23N19.9 


374 


"5 V 


1535


AB027564 


Homo sapiens 


DINB1 


4482 


100 


1536 


Y36178 


Homo sapiens 


Human secreted protein 


377 


87 


1537 


Y50907 


Homo sapiens 


Human fetal brain cDNA clone 
vb3_l derived protein. 


3693


99 


1538 


AF017368 


Mus musculus 


faciogenital dysplasia 
protein 2 


177 


47 


1539 
1540 


AF266756 
Z48804 


Homo sapiens 
Homo sapiens 


sphingosine kinase
OA1


2011 


99 


1541 


AF000195 


Caenorhabditis
elegans


Contains similarity to Pfam 
domain: PF00169 (PH),
Score=20.6, E-value=1.9e-05,
N=1


2238 
379 


100 
42 


1542

1543 


Y71159 
X76092 


Homo sapiens 
Homo sapiens 


Human phosphodiesterase 
interacting protein, 
myomegalin.

DNA binding protein RFX3 


9415 
3327 


99 
100 


1544
1545


AB015330 
AF198487 


Homo sapiens 
Homo sapiens 


HRIHFB2007 

transcription factor LBP-1b


631 
2822 


50 
100 


1546 


AF016417 


Caenorhabdit 
is elegans 


Similar to BZIP transcription 
factor 


518


42 


1547 


X55885 


Homo sapiens 


KDEL receptor 


1106 


100 


1548 
1549 


AB035495 
AL021707


Carassius 
auratus 
Homo sapiens 


ubiquitin-activating enzyme
E1

dJ508I15.4 (KIAA0668) 


836 


42


1550


AJ223978 


Bacillus 
subtilis 


YvqK protein 


3688 
292


100 
42


1551 


AF145615 


Drosophila 
melanogaster


BCDNA.GH03377 


822 


44 


1552 
1553 


AL157734 
AF079527


Schizosaccharomyces
pombe

Mus musculus 


putative mannosyltransferase
involved in N-glycosylation

*ER5 


435 


37 


1554 


AB026291 


Rattus 
norvegicus 


acetoacetyl-CoA synthetase 


691 
1099 


63 
88 


1555 


Y44722 


Homo sapiens 


Human Immune system molecule, 
ISMO-3. 


1780


99 


1556 
1557 


AF116553 
Y71056


Drosophila 
melanogaster 
Homo sapiens 


antennal-specific short-chain
dehydrogenase/reductase
Human membrane transport
protein, MTRP-1.


277
1975


32
99


1558 


Y71056 


Homo sapiens 


Human membrane transport 
protein, MTRP-1. 


1975 


99 


1559 


Y71056 


Homo sapiens 


Human membrane transport 
protein, MTRP-1.


1894 


97 


1560 


AF092050 


Mus musculus


beta-1,3-N-
acetylglucosaminyltransferase


262 


44 


1561 


AL109827


Homo sapiens 


dJ309K20.2 (acrosomal protein
ACR55 (similar to rat sperm
antigen 4 (SPAG4)))


1607 


97 


1562 


AJ131890 


Homo sapiens 


DNA polymerase lambda 


3002 


100


1563


AL035424 


Homo sapiens 


dA22D12.1 (novel protein 
similar to Drosophila Kelch 
proteins) 


3015 


100 


1564 


AC002400 


Homo sapiens 


Gene product with similarity 
to Ubiquitin binding enzyme 


2790 


100 


1565 


AC005306 


Homo sapiens 


R27216 1 


919 


82 


1566 


AF000195 


Caenorhabditis
elegans


Contains similarity to Pfam 
domain: PF00169 (PH),
Score=20.6, E-value=1.9e-05,
N=1


550 


45 


1567 


AB033281 


Homo 
sapiens 


F-box and WD-repeats protein
beta-TRCP2 isoform C


2879 


100 


1568


D49473


Mus musculus 


truncated form of Sox17


1047 


78 


1569 


AK025270 


Homo sapiens 


unnamed protein product


210 




1570 


X75756 


Homo sapiens 


protein kinase C mu


4797 


99 


1571 


AF145713 


Homo sapiens 


SCHIP-1


2388 


100 


1572 


AE003831 


Drosophila 
melanogaster


CG18445 gene product 


180 


31 


1573 


AF074603 


Streptomyces 
griseus 
subsp.
griseus 


NonF 


205 


38 


1574 


U28993


Caenorhabditis
elegans


F22D3.3 gene product 


144 


27 


1575 


AF129507 


Homo sapiens


transcription fact-or T CRPQ O 


287 


68 


1576 


X64878


Homo sapiens 


oxytocin receptor 


2002 


100 


1577 


AF237711 


Drosophila
melanogaster


Diablo 


421 


z>t 


1578 


G00975 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5056. 


480 


100 


1579 


AF248744


Cryptosporidium
parvum


thrombospondin-related
adhesive protein 


123 


33 


1580 


AL121782 


Homo sapiens 


dJ585I14.2 (novel protein
(translation of cDNA
Em:AK000219))


663 


100 


1581 


AF041853


Homo sapiens 


kinesin family member protein 
KIF3A 


345 


33 


1582 


AF025441 


Homo sapiens 


Opa-interacting protein OIP5


1198 


100 


1583 


AE001803 


Thermotoga 
maritima


glycerate kinase, putative 


349 


34 


1584 


AF252283 


Homo sapiens 


Kelch-like 1 protein 


3973 


100 


1585 


AF169675 


Homo 
sapiens 


leucine-rich repeat
transmembrane protein FLRT1 


3494 


99 


1586


AF118274 


Homo sapiens 


DNb-5 


2628 


97 


1587 


X79440 


Homo sapiens 


NADP+-dependent malic enzyme


3167 


99 


1588 


X99802 


Homo sapiens 


ZYG homologue 


3966 


99 


1589 


AF169803 


Homo sapiens 


flavohemoprotein b5+b5R 


2$£3 


100 


1590 


Y29861 


Homo sapiens 


Human secreted protein clone 
cb98 4. 


181 


47 


1591 


Z25535 


Homo sapiens 


nuclear pore complex protein 
hnup153


7567 


99 


1592 


X13293 


Homo sapiens 


B-myb protein (AA 1-700) 


3678 


99 


1593 


M74027 


Homo sapiens 


mucin 


242 


27 


1594 


AL139314 


Schizosaccharomyces
pombe


hypothetical protein


235


54



WO 01/53312



TABLE 2



PCT/US00/34263


1595 


W78324 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 81. 


1318 


98 


1596 


Y94906 


Homo sapiens 


Human secreted protein clone 
rb649_3 protein sequence SEQ 
ID NO: 18. 


2236 


98 


1597 


AF174605 


Homo sapiens 


F-box protein Fbx25


1408 


99 


1598 


AB032254 


Homo 
sapiens 


bromodomain adjacent to zinc 
finger domain 2A 


9676 


98 


1599 


X73114 


Homo sapiens 


slow MyBP-C 


5568 


95 


1600


X82200 


Homo sapiens 


gpStaf50


2305 


100 


1601 


Y0087€J 


Homo 
sapiens 


Human LAPH-1 protein 
sequence.


1149 


98 


1602 


AJ223351 


Homo sapiens 


HIRA-interacting protein 3


2821 


99


1603 


AJ222801 


Homo sapiens 


neutral sphingomyelinase 


2268 


99 


1604 


AJ222801 


Homo sapiens 


neutral sphingomyelinase 


1601 


99 


1605 


AF185576 


Mus musculus


POZ/zinc finger transcription 
factor ODA-8 


3435


97 


1606 


AF093744 


Homo sapiens 


unknown 


131 


100 


1607 


A12142 


synthetic 
construct 


IFN-pseudo-omega 2


800 


98 


1608 


Y57949 


Homo sapiens 


Human transmembrane protein 
HTMPN-73. 


1868 


100 


1609 


AF151044 


Homo sapiens 


HSPC210 


681 


97 


1610 


X15218 


Homo sapiens 


ski protein (AA 1-728)


3765 


JLU U 


1611 


Y08200 


Homo sapiens 


rab geranylgeranyl 
transferase 


2976 


100 


1612 


AF220560 


Homo sapiens 


B/K protein 


2486 




1613 


AC004481 


Arabidopsis 
thaliana 


nodulin-like protein


371 


2* 


1614 


Y09501 


Homo sapiens 


NADH-cytochrome-b5 reductase


1607 


100


1615 


Y15S21 


Homo sapiens 


start position 1 


3150 


y t 


1616 


AJ010750


Rattus 
norvegicus 


Castration induced prostatic
apoptosis related protein-1,
(CIPAR-1)


890 




1617 


X58079


Homo sapiens 


S100 alpha protein 


481 


100 


1618


Y66678 


Homo 
sapiens 


Membrane-bound protein
PRO1009. 


967 


100 


1619 


AJ242973 


Homo sapiens 


peptide methionine sulfoxide 
reductase 


929 


100 


1620 


AF150733 


Homo sapiens 


AD-014 protein


288 


100 


1621 


AJ007509 


Homo sapiens 


E1B-55kDa-associated protein


4646 


98 


1622 


X64177 


Homo sapiens 


metallothionein 


380 


100 


1623 


AE001045 


Archaeoglobus
fulgidus


A. fulgidus predicted coding
region AF0859 


240 


36 


1624 


AI>3 55013 


Schizosaccharomyces
pombe


mitochondrial carrier protein 


403 


34 


1625 


Y66746 


Homo 
sapiens 


Membrane-bound protein
PROH98.


1184 


100 


1626 


D90053 


Sus scrofa 


destrin 


863 


100 


1627


Y359JU 


Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO. 
203.


756 


100 


1628 


AL031775 


Homo sapiens 


dJ30M3.2 (novel protein) 


470 


100 


1629 


AF132484 


Mus musculus


unknown 


286


68 


1630 


AF017096 


Drosophila
melanogaster 


similar to C. elegans 
R10H10.6 and S. cerevisiae 
YD8419.03C 


493 


61 


1631 


X03077 


Homo sapiens 


lactate dehydrogenase-A


1704 


100 


1632 


AF151084 


Homo sapiens 


HSPC250 


763 


100 


1633


AJ001874 


Homo sapiens 


orf 


255 


97 


1634 


AC012187


Arabidopsis 
thaliana 


Contains weak similarity to 
GATA-6 DNA-binding protein
gb|H36135, gb|Z26200 come
from this gene. 


143 


38 



1635 


AF026246 


Homo sapiens 


HERV-E integrase 


411 


90 


1636 


Y50943 


Homo sapiens 


Human adult brain cDNA clone 
ve8_1 derived protein.


1126 


95 


1637 


AF134593 


Homo sapiens 


L-pipecolic acid oxidase


2068 


99 


1638 


AJ238247 


Mus musculus 


putative phosphatase subunit 


1948 


95 


1639 


Y94942 


Homo sapiens 


Human secreted protein clone 
yk251_1 protein sequence SEQ
ID NO: 90. 


1320 


100 


1640 


AF235030 


Homo sapiens 


BM88 antigen


766 


99 


1641 


AF233288 


Drosophila 
melanogaster 


WDS 


358


26


1642 


M19351 


Mus musculus 


immunoglobulin heavy chain 
binding protein 


145 


34 


1643 


Y70452 


Homo sapiens 


Human membrane channel 
protein-2 (MECHP-2).


1352 


100 


1644 


AF176520 


Mus musculus 


WD repeat-containing F-box
protein FBW5 




88 


1645 


W67B16 


Homo sapiens 


Human secreted protein 
encoded by gene 10 clone 
HCEMU42.


1156


100 


1646 


X67155 


Homo sapiens 


mitotic kinase-like protein-1 


4456


99 


1647 


M63180 


Homo sapiens 


threonyl-tRNA synthetase 


1040 


61 


1648


Y87342 


Homo sapiens 


Human signal peptide 
containing protein HSPP-119 
SEQ ID NO: 119. 


130D 


93 


1649 


R95332 


Homo sapiens 


Tumor necrosis factor 
receptor 1 death domain 
ligand (clone 3TW).


4137 


100 


1650 


AC007136


Homo sapiens 


Putative map kinase 
interacting kinase 


856 


99 


1651 


AB015346 


Homo sapiens 


Eps15R


4464


99 


1652 


AL161576 


Arabidopsis 
thaliana 


putative protein 


1341 


48 


1653 


AC005313 


Arabidopsis 
thaliana 


putative calmodulin 


288 


28 


1654 


AL031428 


Homo sapiens 


dJ184J9.1 (KIAA0601 protein) 


3526 


100 


1655 


AL031428 


Homo sapiens 


dJ184J9.1 (KIAA0601 protein)


3526 


100


1656 


AB017910 


Dictyostelium
discoideum


myoM 


297 


32 


1657 


Y28919 


Homo 
sapiens 


Human regulatory protein 
HRGP-S. 


2251 


99 


1658 


AF056191 


Homo sapiens 


TPA inducible protein 


2744 


98 


1659 


U76846 


Arabidopsis 
thaliana 


ubiquitin-specific protease


"l"3"7 


35 


1660 


AL078627 


Schizosaccharomyces
pombe


actin-like protein; (2 actin
domains)


320 




1662 


X52022 


Homo sapiens 


collagen type VI, alpha 3 
chain 


16274 


99 


1663 


AF300648 


Homo 
sapiens 


guanine nucleotide binding 
protein beta subunit 4 


1811 


100 


1664


AF214736 


Homo sapiens 


EH domain containing protein 
2 


2774 


100 


1665 


Z48613 


Saccharomyce 
s cerevisiae 


unknown 


138 


~2£ 


1666 


AF177385 


Homo 
sapiens 


cytochrome c oxidase assembly 
protein isoform 2 


1395 


99 


1667 


AC007842 


Homo sapiens 


BC331191 1 


1581 


47 


1668 


S67513 


Borna 
disease 
virus BDV, 
WT-1, Halle 
Bl/91, horse 
brain, field 
isolate. 
Peptide, 370 


p40 


397 


43 



1669


Z99753


Schizosaccharomyces
pombe


putative NOL1-NOP2-sun family
nucleolar protein


569


47


1670


G03130


Homo sapiens


Human secreted protein, SEQ
ID NO: 7211.


427


97

1672 
1673



M96625 



AF174482 
Y51B46 . 



Gallus 
gallus



cardiac muscle tensin 



Ties' 



Homo sapiens 



polycomb 3 



2005 



54 



99 



1674 



1675



Homo 



sapiens 



AF255334 



Human 18.1 homolog protein 
fragment.



233 



794367" 



Homo sapiens 



EXP 3 5 



Homo 
sapiens 



15T 



Human protein clone HP10563. 



109 



29 



29 



"30" 



1677 
1678 



1679 



1680 
1681 



1682 



1683 



Y25712 



Homo sapiens 



Y25712 



Homo sapiens 



Human secreted protein 
encoded from gene 2.



3043 



AF163151



Homo sapiens 



Human secreted protein 
encoded from gene 2.



1580 



AF163151 



Homo sapiens 



AK024453



AF019236 



Homo sapiens 



Dictyosteliu 
m discoideum 



AJ243459 



Z69369 



Leishmania 
major 



Schizosaccharomyces
pombe



dentin sialophosphoprotein 
precursor 

dentin sialophosphoprotein 

precursor 

FLJ00045 protein



170 



TipD 



1349 



613 



proteophosphoglycan



putative GTP-binding protein



153 



99 



91 



17 



100 



34 



26



46



1685 



1686



1687 



1688 
1689



1690 



1692 



1693 



AF286475 



Homo sapiens 



AF191298 



Takifugu 
rubripes 



ERp28 



1334 



AJ275986 



Homo sapiens 



retinitis pigmentosa GTPase 
regulator-like protein



196 



AJ275986 



Homo sapiens 



vacuolar sorting protein 35 



X07311 



Homo sapiens 



transcription factor 



087 



AF240463 



Drosophila 
melanogaster 



transcription factor 



2958 



heat shock protein 



1886 



138 



AJ272078 



Rattus 
norvegicus 



LIS1-interacting protein
NUDE1 



1383 



AJ272079 



Homo sapiens 



AF177942 



Homo sapiens 



Xenopus 
laevis 



APOBEC-1 stimulating protein 
APOBEC-1 stimulating protein



katanin p60 



1256 



1336 



1664 



100 



100 



100 
88 



43 



83 



68 



60 



66 



1694 
1695



Homo sapiens



arginine N-methyltransferase



1774 



100 



1696



1697 



1698 



1699 
1700 



AK000193 



Homo 
sapiens 



AB041035 



Homo sapiens 



Homo sapiens 



protein arginine N-
methyltransferase 1-variant 2
unnamed protein product 



1182 



AB041035 



kidney superoxide-producing 
NADPH oxidase 



1060 



3122 



Homo sapiens 



AF025772 



Homo sapiens 



kidney superoxide-producing 

NADPH oxidase 

C2H2 zinc finger protein



2181 



488 



81 



100 



100 



100 



54 



1701 



1702



Homo sapiens 



AK022407 



AB024574 



Human ARF-Related Protein-1
(HARP-1).



938



Homo sapiens 



Homo sapiens 



unnamed protein product



315 



GTP-binding like protein 2 



1172 



97 



98 
100 



1704 



1705



AF198092 
AE003573



Homo sapiens 



Mus musculus



zinc finger protein 42



Drosophila 
melanogaster 



RP"42~ 



CG12474 gene product 



421 



1057 



161



52



77 



33 



1707 



Y55927 



Drosophila 
melanogaster 



aquaporin 



164



1708
1709



U27121 
AE39T710- 



Homo sapiens 



Danio rerio 
Arabidopsis
thaliana



Human STLK2 protein.



2T46



G12



212



100
47



putative protein



505



50


1710 


B01311 


Homo sapiens 


Human PRO241 polypeptide.


1649 


97 


1711 


U40750 


Mus musculus 


formin binding protein 30 


4561


o e 

op 


1712 


AJ011118 


Mus musculus 


skeletal muscle and cardiac
protein 


1490


89 


1713 


AF255303


Homo 
sapiens 


membrane-associated nucleic 
acid binding protein 


4416 


99 


1714 


AF255303 


Homo 
sapiens 


membrane-associated nucleic
acid binding protein 


2960 


100


1715 


U68227 


Rattus 
norvegicus 


Ras-related protein


511 


51 


1716 


AF168795 


Rattus 
norvegicus 


schlafen-4 


1129 


44 


1717 


AF195304


Homo sapiens 


SUMO-1-specific protease


5804 


99 


1718 


AL355737 




HMG20A




100 


1719 


AB029333 


Halocynthia 
roretzi 


HrPET-1 


1069 


46 


1720 


AF071317 


Mus musculus 


COP9 conrnl cnhnni f *7H 




97 


1721 


AJ272215 


Homo sapiens 


HEYL protein 


1 CD! 


99 


1722 


G01982 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 6063.


718 


100 


1723 


AL032643 


Caenorhabditis
elegans


similar to uncharacterized
protein family UPF0034,


825 


41 


1724 


G01972 


Homo sapiens 


Human secreted protein, SEQ 


586 


92 


1725 


Y94441 


sapiens 


nuuidii Adipose opecinc 


1231 


100 


1726 


AF255443 


Homo sapiens 


CGI-201 protein 


4397 


99 


1727 


AF183426 


Homo sapiens 




1810 


99 


1728 


D10884 






1002 


99 


1729 


Z18529 


Gallus 
gallus 


tensin 


1411 


84 


1730 


Z73423 


Caenorhabditis
elegans


IjO X OPIPI 1 , £j A. *x _? LI O OUUlC i? 

from this gene-cDNA EST this 
gene 


233 


41 


1732 


AF090891 


Homo sapiens 


pkobioS 


470 


30 


1733 


AJ277724 


Homo sapiens 


histone deacetylase 8 


2015 


100 


1734 


G04050 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 8131. 


503 


95 


1735 


D45913 


Mus musculus 


leucine-rich-repeat protein


3531 


94 


1736 


AF096709


Drosophila
virilis 


failed axon connections 


276 


32 


1737 


AF195120 


Homo sapiens 


dynactin p62 subunit 


2417 


99


1738 


L15314 


Caenorhabditis
elegans


contains similarity to Pfam 
familv PPOI 773 M— 1 


206 


37


1739 


X54618 


Listeria
monocytogenes


phosphatidylinositol specific


134 


27 


1740 


AL031658 


Homo sapiens 


dJ31Q011 4 (novel protein
similar to predicted C.
elegans and C. intestinalis
proteins)


123 


31 


1741 


Y35924 


Homo sapiens 


Extended human secreted
protein sequence, SEQ ID NO.
173.


JLUX J 


99 


1742 


AC013354 


Arabidopsis 
thaliana 


F15H18.15 


202 


32 


1743 


W75771 


Homo 
sapiens 


Human GTP binding protein 
APD08.


1932 


59 


1744 


W75771 


Homo 
sapiens 


Human GTP binding protein 
APD08.


1854 


61 


1745 


AF221098 


Homo 
sapiens 


Ral guanine nucleotide
exchange factor RalGPSlA 


1224 


70 


1746 


Y99372 


Homo sapiens 


Human PRO1430 (UNQ736) amino
acid sequence SEQ ID NO: 116. 


1332 


99 


1747 


xy«^y4 Homo sapiens 


Human coenzyme A-utilising
enzyme CoAEN-2.


842


100


1748 
1749 


AK024436
AE000877


Homo sapiens 
Methanobacterium
thermoautotrophicum


FLJ00026 protein 
conserved protein 


1619 
231 


100
36


1750 
1751 


AF101361
Y15067 


Drosophila 
melanogaster 
Homo sapiens 


Abnormal X segregation 
ZNF232 


193 
889 


33 
100 


1752 
1753 

1754 


AF251038 
AC003093

X69089


Homo sapiens 
Homo sapiens 

Homo sapiens 


GAP-like protein 
OXYSTEROL-BINDING PROTEIN;
■*^* BiiuixariLy co P22059 
(PID:g129308)
165kD protein 


822

352

5703 


100 
57 

99 


1755 
1756 


AL0497S5 
AL031393


Homo sapiens
Homo sapiens 


dJ622L5.3 (novel protein) 
dJ733D15.1 (Zinc-finger
protein) 


1039 
2765 


100 
100 


1757 

1758
1759


AB040672 

AL022238 
AF117S53 


Homo sapiens 

Homo sapiens 
Homo sapiens 


UDP-GalNAc: polypeptide N-
acetylgalactosaminyltransferase

dJ1042K10.4 (novel protein)
double homeobox protein 


2020 
776 


99
43 


1760
1761 


V12&65 
AL049712 


Homo sapiens 
Homo sapiens 


HNop56
dJ686C3.2 (nucleolar protein 
hNop56) 


375 

2959 

2595 


54 
99 
99 


1762 


AC002394 


Homo 
sapiens 


Gene product with similarity 
to dynein beta subunit 


1542 


51 


1763 


AF169017 


Homo sapiens 


formiminotransferase
cyclodeaminase 


877 


100 


1764 


U91541 


Homo sapiens 


human formiminotransferase
cyclodeaminase (ftcd) protein,
carboxy- terminal end 


594 


100


1765 
1766


AB013365 


Bacillus 
halodurans 


•YiqF 


350 


., ... 




Y38421 


Homo sapiens 


Human secreted protein 
encoded by gene No. 36. 


145 


71 


1767 

1768 
1769 
1770
1771


AC009176

AK000647 
AJ238982 
U73522 


Arabidopsis
thaliana 

Homo sapiens 
Homo sapiens 
Homo sapiens 


putative ribulose-1,5-
bisphosphate

carboxylase/oxygenase small
subunit N-methyltransferase I
unnamed protein product 
VNN3 protein 

AMSH 


216

737 

2665 

1214 


27 

99 
99 
56 


1772
1773
1774

1775
1776


U89435 
S70011 
AL035086 
Y99426 

AF110330 


Mus musculus
Rattus sp. 
Homo sapiens 
Homo sapiens 

Homo sapiens 


unknown 

tricarboxylate carrier 
dJ44A20.2 (novel protein) 

nunmu rAUXDU<l \UINU/oi3^ amino 

acid sequence SEQ ID NO: 308.
glutaminase 


829 
1604 
2036 
1057 

3146 


86 
95 
100 
99 

100 


1777 

1778


AJ269529 
Z81579 

AY007239


Homo sapiens 
Caenorhabdit 
is elegans 
Homo sapiens 


glycerol 3 -phosphate permease 
cdna EST y*76£1.5 comes from " " 
this gene 
monooxygenase X 


2787 
232 


100 

31 


1779 
1780


AIil09608 
AF254260 


Schizosaccharomyces
pombe

Homo sapiens 


oxysterol-binding protein
family 


1875 
644


99 
38 


1781 
1782 


L07924 


Mus musculus 


guanine nucleotide 
dissociation stimulator 


1729 
247 


100 
50 


1783 


AF295773 
AK024475


Homo 
sapiens 
Homo sapiens


ral guanine nucleotide 
dissociation stimulator 
FLJ00068 protein


142 
1333 


49
100 


1784

1785

1786


AK024475
G03933

582637


Homo sapiens
Homo sapiens

Homo sapiens


FLJ00068 protein

Human secreted protein, SEQ
ID NO: 8014.

Ig lambda-like gene/beta-
glucuronidase exon 11 homolog


J996
570

>47 2


93
100

100







TABLE 3



SEQ ID NO: 


ACCESSION 
NO. 


DESCRIPTION 


RESULTS* 


2 


BL00240


Receptor tyrosine kinase 
class III proteins.


BL00240B 24.70 8.250e-
12 157-181


3 


PR00109 


TYROSINE KINASE
CATALYTIC DOMAIN
SIGNATURE


PR00109D 17.04 5.055e-
13 358-381


4 


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 9.400e- 
10 1129-1146 BL00028 
16.07 1.257e-09 820- 
837 


5 


BL00023 


Type II fibronectin 
collagen-binding domain
proteins.


BL00023 24.31 8.920e- 
33 413-450 BL00023 
24.31 4.545e-27 353- 
390 


6 


BL00023


Type II fibronectin 
collagen-binding domain
proteins.


BL00023 24.31 8.920e-
33 413-450 BL00023
24.31 4.545e-27 353- 
390 


7 


BL00023 


Type II fibronectin 
collagen-binding domain
proteins.


BL00023 24.31 8.920e- 
33 413-450 BL00023 
24.31 4.545e-27 353- 
390 


8 


BL00023


Type II fibronectin 
collagen-binding domain
proteins.


BL00023 24.31 8.920e- 
33 413-450 BL00023 
24.31 4.545e-27 353- 
390 




BL01160


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 5.119e- 
09 863-917 


10 


PR00464 


E-CLASS P450 GROUP II 
SIGNATURE 


PR00464D 17.40 4.182e- 
12 294-312 PR00464G 
12.41 4.231e-ll 377- 
393 


11 


PR00734 


GLYCOSYL HYDROLASE 
FAMILY 7 SIGNATURE 


PR00734I 11.46 4.255e-
09 502-520


12 


PF00023 


Ank repeat proteins. 


PF00023B 14.20 6.500e- 
10 89-99 PF00023B 
14.20 2.636e-09 56-66 


14 


DM00031 


IMMUNOGLOBULIN V REGION. 


DM00031B 15.41 3.848e-
09 79-113


15 


PR00208 


GLIADIN AND LMW GLUTENIN 
SUPERFAMILY SIGNATURE 


PR00208A 12.59 $.868e-
10 517-535 PR00208A
12.59 2.233e-09 520- 
538 


17 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 8.200e- 
14 282-295 PD00066 
13.92 9.400e-14 477- 
490 PD00066 13.92 
6.500e-13 505-518 
PD00066 13.92 9.500e- 
13 254-267 PD00066 
13.92 1.429e-12 393- 
406 PD00066 13.92 
6.571e-12 421-434 


18 


BL00845 


CAP-Gly domain proteins. 


BL00845 16.43 2.200e- 
25 55-80 


20 


BL00487 


IMP dehydrogenase / GMP 
reductase proteins. 


BL00487E 16.12 5.737e- 
26 154-199 BL00487F 
18.79 8.984e-22 235- 
276 BL00487G 26.82 
4.082e-12 287-329 


21 


BL00487 


IMP dehydrogenase / GMP 
reductase proteins. 


BL00487E 16.12 5.737e- 
26 154-199 BL00487F 
18.79 8.984e-22 235- 
276 BL00487G 26.82 
4.082e-12 348-390 


22 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 3.250e-26 302-333


23 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 3.250e-26 302-333


2^ 


BL00115 


Eukaryotic RNA
polymerase II
heptapeptide repeat
proteins.


BL00115T 8.45 7.273e-29 1208-1242
BL00115Q 18.08 2.776e-21 953-983
BL00115Y 11.86 8.000e-17 1604-1650
BL00115M 19.19 8.130e-16 731-774
BL00115H 14.34 9.392e-16 463-496
BL00115A 15.44 7.414e-15 43-82
BL00115R 6.50 6.128e-14 983-1010
BL00115J 16.71 9.289e-14 591-617
BL00115I 8.33 4.336e-13 535-590
BL00115L 12.25 5.939e-13 662-694
BL00115G 11.65 6.011e-13 435-463
BL00115K 15.03 [illegible] 617-659
BL00115O 16.76 5.805e-10 863-913
BL00115P 11.54 7.538e-10 913-[illegible]
BL00115S 18.24 7.968e-10 1010-1052
BL00115U 10.34 4.475e-[illegible]


26


BL00420


Speract receptor repeat 
proteins domain 
proteins. 


BL00420A 20.42 4.109e-11 81-110
BL00420A 20.42 8.820e-10 84-113


27 


BL00050


Ribosomal protein L23
proteins.


BL00050A 23.71 9.250e- 
27 94-127 BL00050B 
14.81 8.125e-12 133- 
147 


28


PR00925 


NONHISTONE CHROMOSOMAL 
PROTEIN HMG17 FAMILY 
SIGNATURE 


PR00925B 3.73 3.089e- 


29 


PF00756 


Putative esterase. 


PF00756C 14.12 1.108e- 
09 486-516 


32 


BL00557 


FMN-dependent alpha- 
hydroxy acid 
dehydrogenases proteins.


BL00557D 17.7£ S.O^Be-37 274-316
BL00557A 35.08 8.909e-29 24-73
BL00557C 15.59 1.000e-28 227-257
BL00557B 21.27 8.898e-22 130-169


34 


PR00629 


SHC PHOSPHOTYROSINE 
INTERACTION DOMAIN 
SIGNATURE 


PR00629E 9.90 5.886e-35 299-328
PR00629F 10.95 8.364e-32 334-361
PR00629B 13.66 3.786e-27 224-247
PR00629A 13.45 8.364e-[illegible]
[illegible] 3.80 4.000e-12 249-261
PR00629D 12.45 3.739e-11 276-286


35 


PD01270 


RECEPTOR FC 
IMMUNOGLOBULIN AFFIN. 


PD01270A 17.22 1.000e-40 39-79
PD01270B 22.18 2.875e-38 94-131
PD01270D 24.66 3.700e-34 171-207
PD01270C 19.54 3.455e-30 137-166


36


PD01270


RECEPTOR FC 
IMMUNOGLOBULIN AFFIN. 


PD01270A 17.22 1.000e-40 39-79
PD01270B 22.18 2.875e-38 94-131
PD01270D 24.66 3.700e-34 171-207
PD01270C 19.54 3.455e-30 137-166


37 


BL00412 


Neuromodulin (GAP-43) 
proteins. 


BL00412C 10.28 9.241e-10 264-298


38 


BL00412


Neuromodulin (GAP-43)
proteins.


BL00412C 10.28 9.241e-10 264-298


39 


BL00412 


Neuromodulin (GAP-43)
proteins. 


BL00412C 10.28 9.241e- 
10 264-298 




PR00380


KINESIN HEAVY CHAIN
SIGNATURE


PR00380B 12.64 7.366e-14 342-360
PR00380C 13.18 6.927e-13 375-394
PR00380D 9.93 2.180e-12 429-451
PR00380A 14.18 5.154e-12 143-165


44 


BL00345


Ets-domain proteins. 


BL00345B 21.28 1.000e-40 239-290
BL00345A 13.96 2.452e-14 204-223


45 


BL00345


Ets-domain proteins. 


BL00345B 21.28 1.000e-40 215-266
BL00345A 13.96 2.452e-14 180-199


46 


DM01551 


kw OSTEOINDUCTIVE YOPM 
MEMBRANE OUTER. 


DM01551A 15.63 3.538e-26 172-202
DM01551C 14.62 3.571e-17 232-252
DM01551B 8.84 4.750e-11 214-226


47 


PR00876


NEMATODE METALLOTHIONEIN
SIGNATURE


PR00876B 7.66 $.328e-11 246-260


48 


PD01066 


PROTEIN ZINC FINGER
ZINC-FINGER METAL-BINDING NU.


PD01066 19.43 4.231e- 
33 6-45 


50 


BL00972 


Ubiquitin carboxyl- 
terminal hydrolases 
family 2 proteins. 


BL00972D 22.55 7.750e- 
19 994-1019 BL00972A 
11.93 7.120e-18 216- 
234 BL00972E 20.72 
9.471e-14 1020-1042 
BL00972C 16.48 7.000e- 
13 360-375 BL00972B 
9.45 8.269e-10 302-312


51 


BL00972 


Ubiquitin carboxyl- 
terminal hydrolases 
family 2 proteins. 


BL00972D 22.55 7.750e- 
19 990-1015 BL00972A 
11.93 7.120e-18 216- 
234 BL00972E 20.72 
9.471e-14 1016-1038 
BL00972C 16.48 7.000e- 
13 360-375 BL00972B 
9.45 8.269e-10 302-312 


52 


BL01115 


GTP-binding nuclear
protein ran proteins.


BL01115A 10.22 3.063e- 
14 10-54 


53 


PR00988 


URIDINE KINASE SIGNATURE 


PR00988A 6.39 8.500e-17 20-38
PR00988F 12.23 7.828e-15 196-210
PR00988C 13.64 [illegible]
PR00988E 8.27 3.872e-11 174-186
PR00988D 5.95 6.878e-10 160-171
PR00988B 11.60 2.915e-09 57-69


55 


PR00762 


CHLORIDE CHANNEL 
SIGNATURE 


PR00762C 9.29 4.*82e-21 294-314
PR00762D 11.29 4.103e-19 509-530
PR00762A 14.22 9.333e-18 199-217
PR00762F 15.12 3.100e-16 [illegible]
PR00762B 12.12 6.063e-16 230-250
PR00762E 12.07 [illegible]
PR00762G 14.13 6.276e-13 601-616


56 




proteins . 


RT.flO? IfiR 5 7 fid ft ftonV- 
DXj UW£ loo ^ / .04 o.owue 

10 153-203 


58 




Domain present in ZO-1
and Unc5-like netrin


10 1080-1135 


59 


PF00791 


Domain present in ZO-1
and Unc5-like netrin
receptors.


PF00791B 28.49 2.049e- 

XU lUb£-J,ll f 


61


PD01929


ANTIBIOTIC TRANSFERASE
AM.


PD01929t!> 10.7© y.Oloe- 
09 206-221 


68 


PR00360 


C2 DOMAIN SIGNATURE 


PR00360A 14.59 7.395e-09 680-693


69 


PR00360 


C2 DOMAIN SIGNATURE 


PR00360A 14.59 7.395e- 
09 670-683 




PF00651


BTB (also known as BR-C/Ttk)
domain proteins.


PF00651 15.00 8.714e- 
10 51-64 


72 


DM00179 


w KINASE ALPHA ADHESION
T-CELL.


DM00179 13.97 5.304e- 
09 108-118 


73


BL00239


Receptor tyrosine kinase
class II proteins. 


BL00239B 25.15 7.075e- 
12 118-156 


74 


BL00790 


Receptor tyrosine kinase 
class V proteins. 


BL00790N 13.25 6.116e- 
10 93-120 


76 


DM00471 


0 PROKARYOTIC DNA
TOPOISOMERASE I. 


DM00471A 11.73 9.357e- 
13 53-66 DM00471B 
8.45 4.857e-12 70-81 


80 


PD02876 


DECARBOXYLASE
PHOSPHATIDYLSERINE.


PD02876C 8.80 2.723e- 
13 223-236 PD02876D 
12.13 2.588e-12 334- 
351 


81


PD02876


DECARBOXYLASE
PHOSPHATIDYLSERINE.


PD02876C 8.80 2.723e-13 282-295
PD02876D 12.13 2.588e-12 393-410


83 


BL00708


Prolyl endopeptidase 
family serine proteins. 


ct rvftnnon *>>i *l TO r 7« 

BLiOO/UBB 24.91 f,13/e- 
12 570-601 


84 


PR00014


FIBRONECTIN TYPE III
REPEAT SIGNATURE


PR00014C 15.44 o , 043e- 
09 985-1004 


86 


PR00678


REGULATORY SUBUNIT 
SIGNATURE 


rKUOo/Bn 9 . 1 J l.J/ye- 
09 246-269 


89 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320C 13.01 8.200e-09 264-279
PR00320B 12.19 8.650e-09 264-279


93 


BL00455 


Putative AMP-binding 
domain proteins. 


BL00455 13.31 2.588e- 
14 316-332 


95 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 4.000e- 
10 123-154 


96 


BL00107


Protein kinases ATP-binding
region proteins.


BL00107A 18.39 4.000e- 

Xv Al£"<tlJ 


97 


PR00081


GLUCOSE/RIBITOL 
DEHYDROGENASE FAMILY 
SIGNATURE 


PR00081B 10.38 6.318e- 
13 134-146 PR00081A 
10.53 2.500e-12 54-72 


98 


PR00380


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 5.500e-24 401-423
PR00380D 9.93 7.188e-20 613-635
PR00380B 12.64 7.517e-16 529-547
PR00380C 13.18 2.756e-13 560-579


102 


PR00300 


ATP-DEPENDENT CLP
PROTEASE ATP-BINDING
SUBUNIT SIGNATURE


PR00300A 9.56 7.545e- 
14 289-308 


104 


BL00479 


Phorbol esters / 
diacylglycerol binding 
domain proteins. 


BL00479B 12.57 6.786e- 
18 298-314 BL00479A 
19.86 4.913e-16 155- 
178 BL00479A 19.86 
4.300e-13 272-295 
BL00479B 12.57 6.294e-12 181-197


106 


BL01019 


ADP-ribosylation factors 
family proteins. 


BL01019A 13.20 8.013e-12 43-83


107 


DM01970 


0 Jew ZK632.12 YDR313C 
ENDOSOMAL III.


DM01970B 8.£0 5.000e- 
16 403-416 


108 


BL00191 


Cytochrome b5 family, 
heme-binding domain 
proteins. 


BL00191K 17.38 4.951e-27 238-282
BL00191J 11.37 6.447e-17 182-204


109 


PD01066 


PROTEIN ZINC FINGER
ZINC-FINGER METAL-BINDING NU.


PD01066 19.43 4.938e- 
37 8-47 


110 


BL01138 


Scorpion short toxins
proteins.


BL01138A 10.96 8.297e- 
10 38-50 


113 


BL00107 


Protein kinases ATP-binding
region proteins.


BL00107A 18.39 5.800e-23 156-187
BL00107B 13.31 9.100e-14 225-241


117


BL00214


Cytosolic fatty-acid 
binding proteins. 


BL00214B 26.51 1.000e-17 46-91
BL00214A 21.17 7.052e-11 5-31


118 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 8.5£0e- 
13 36-67 


119 


PR00529


GONADOTROPHIN RELEASING
HORMONE RECEPTOR 
SIGNATURE 


PR00529C 11.03 7.506e- 
10 158-177 


120 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320C 13.01 9.400e- 
09 80-95 


121 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320C 13.01 9.400e-09 80-95


127 


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 7.158e- 
13 216-241 


128 


BL01032 


Protein phosphatase 2C 
proteins. 


BL01032C 6.14 3.195e-12 147-157
BL01032H 11.25 5.680e-11 318-331
BL01032G 8.33 8.932e-11 282-296
BL01032I 10.42 8.902e-09 379-389


129 


BL01310 


ATP1G1 / PLM / MAT8
family proteins. 


BL01310 14.74 6.694e- 
26 28-64 


130


PR00990


RIBOKINASE SIGNATURE 


PR00990B 12.32 9.534e- 
15 47-67 PR00990A 
16.23 5.500e-14 20-42 
PR00990C 12.62 2.412e-09 119-133


133


BL00880 


Acyl-CoA-binding 
protein. 


BL00880 17.52 5.575e-26 72-122


134 


BL00030


Eukaryotic RNA-binding 
region RNP-1 proteins. 


BL00030A 14.39 9.308e- 
14 18-37 


135 


PR00215 


NEUROMODULIN SIGNATURE


PR00215C 13.98 6.779e-10 475-496


136


BL01310 


ATP1G1 / PLM / MAT8
family proteins. 


BL01310 14.74 2.432e-29 71-107


140


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 7.882e-14 214-231
BL00028 16.07 9.471e-14 102-119
BL00028 16.07 2.800e-13 18-35
BL00028 16.07 5.500e-13 74-91
BL00028 16.07 9.100e-13 186-203
BL00028 16.07 [illegible]
BL00028 16.07 8.435e-12 [illegible]
BL00028 16.07 9.217e-12 270-287
BL00028 16.07 6.392e-11 242-259
BL00028 16.07 4.000e-10 158-175


141 


BL00501 


Signal peptidases I
serine proteins.


BL00501D 16.69 9.538e- 
14 113-133 BL00501C 
9.61 8.688e-10 89-101 


143 


BL01020


SAR1 family proteins.


BL01020C 15.35 7.722e- 
20 79-130 


146 


PD01066 


PROTEIN ZINC FINGER
ZINC-FINGER METAL-BINDING NU.


PD01066 19.43 6.400e- 
25 335-374 




BL00126


3'5'-cyclic nucleotide
phosphodiesterases
proteins.


BL00126C 22.07 1.450e-25 509-550
BL00126E 35.22 3.951e-16 654-709
BL00126D 25.50 1.360e-15 565-604
BL00126B 15.20 8.200e-11 483-495
BL00126A 27.56 8.269e-11 442-479


151 


BL00632


Ribosomal protein S4
proteins.


BL00632 23.79 5.271e-20 106-149


154 


BL00559 


Eukaryotic molybdopterin 
oxidoreductases 
proteins.


BL00559I 13.63 5.304e- 
19 29-58 BL00559K 
13.17 2.957e-18 172- 
199 BL00559J 19.63 
8.385e-13 99-151 
BL00559L 13.60 5.814e-12 241-259


155 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE


PR00449A 13.20 1.692e- 
13 13-35 


157 


BL00406 


Actins proteins. 


BL00406D 12.58 2.547e-18 275-330
BL00406A 9.9b i».776e-16 15-50
BL00406B 5.47 7.429e-[illegible]
BL00406C 6.75 9.682e-12 128-183


160 


BL00132 


Zinc carboxypeptidases,
zinc-binding region 1
proteins.


BL00132A 26.07 7.000e- 
14 22-63 BL00132C 
21.35 3.466e-12 104- 
145 


165 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 9.043e- 
13 139-158 


168 


BL00362 


Ribosomal protein S15
proteins.


15 129-172 


169 


BL00039 


DEAD-box subfamily ATP-dependent
helicases
proteins.


BL00039D 21.67 1.000e-35 640-686
BL00039A 18.44 1.964e-13 212-251
BL00039B 19.19 4.553e-13 378-404
BL00039C 15.63 8.773e-12 465-489


175 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 3.721e- 
12 14-36 


178 


BL01310 


ATP1G1 / PLM / MAT8
family proteins. 


BL01310 14.74 2.432e-29 133-169


179 


PD01066 


PROTEIN ZINC FINGER
ZINC-FINGER METAL-BINDING NU.


PD01066 19.43 9.455e-36 6-45


180 


PR00007


COMPLEMENT C1Q DOMAIN
SIGNATURE


PR00007B 14.16 7.429e-20 160-180
PR00007A 19.33 4.938e-19 133-160
PR00007C 15.60 1.225e-15 206-228
PR00007D 9.64 6.885e-11 238-249


181 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 9.526e-24 280-323


182 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 9.526e- 
24 263-306 


183 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 9.526e- 
24 280-323 


184 


BL00027 


'Homeobox' domain
proteins. 


BL00027 26.43 9.526e- 
24 263-306 


188 


PR00929


AT-HOOK-LIKE DOMAIN
SIGNATURE 


PR00929C 5.26 3.328e- 
09 460-471 


189 


PR00929 


AT-HOOK-LIKE DOMAIN
SIGNATURE 


PR00929C 5.26 3.328e- 
09 440-451 


190 


BL00383 


Tyrosine specific 
protein phosphatases 
proteins. 


BL00383F 15.51 7.188e-17 666-682
BL00383A 13.34 8.714e-17 162-177
BL00383E 10.35 1.000e-14 333-344
BL00383E 10.35 7.300e-14 628-639
BL00383F [illegible]-387
BL00383C 10.10 3.000e-13 217-228
[illegible]e-13 295-308
BL00383B 7.61 1.592e-11 107-[illegible]
BL00383C 10.10 1.750e-09 509-520
BL00383D 11.92 4.000e-09 589-602
BL00383B 7.61 8.000e-09 479-488


191 


PR00450 


RECOVERIN FAMILY
SIGNATURE 


PR00450C 12.22 7.911e- 
15 83-105 PR00450C 
12.22 6.286e-13 47-69 


193 


PF00564


Octicosapeptide repeat 
proteins. 


PF00564B 24.74 6.164e- 
16 227-278 


194 


PR00503


BROMODOMAIN SIGNATURE


PR00503D 20.81 9.156e- 
15 204-224 PR00503B 
9.96 9.571e-13 170-187


195 


BL00901


Cysteine
synthase/cystathionine
beta-synthase P-phosphate att.


BL00901C 20.63 3.429e- 
18 67-117 


197 


BL00636


Nt-dnaJ domain proteins.


BL00636A 8.07 6.211e- 
17 40-57 BL00636B 
15.11 2.000e-13 67-88 


198 


PR00690 


ADHESIN FAMILY SIGNATURE 


PR00690A 10.86 9.866e- 
09 463-482 


199


BL01131 


Ribosomal RNA adenine 
dimethylases proteins. 


BL01131A 26.62 2.343e- 
12 84-130 


201 


PR00910 


LUTEOVIRUS ORF6 PROTEIN 
SIGNATURE 


PR00910A 2.51 8.352e- 
12 509-522 


203 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 2.286e-10 39-72


206


PR00261 


LOW DENSITY LIPOPROTEIN 
(LDL) RECEPTOR SIGNATURE 


PR00261A 11.02 4.462e-19 65-87
PR00261C 11.37 9.308e-19 65-87
PR00261D 12.47 2.667e-18 65-87
PR00261B 14.12 4.000e-18 143-165
PR00261A 11.02 4.833e-18 143-165
PR00261D 12.47 7.500e-18 143-165
PR00261B 14.12 5.065e-16 65-87
PR00261C 11.37 8.967e-16 143-165
PR00261F [illegible]-165
PR00261E 11.08 7.188e-13 65-87
PR00261F 11.57 7.188e-13 65-87
PR00261E 11.08 1.643e-11 143-165


209 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin
receptors.


PF00791B 28.49 6.143e-13 118-173
PF00791C 20.98 [illegible] 132-171


211


PR00007 


COMPLEMENT C1Q DOMAIN 
SIGNATURE 


19 131-158 PR00007B 
14.16 4.115e-18 158- 
178 PR00007C 15.60 
1.675e-15 201-223 
PR00007D 9.64 7.231e- 
11 233-244 


212 


BL00183 


Ubiquitin-conjugating
enzymes proteins. 


BL00183 28.97 1.545e-30 43-91


213 


BL00183


Ubiquitin-conjugating
enzymes proteins. 


BL00183 28.97 1.545e- 


215 


BL00039 


DEAD-box subfamily ATP-dependent
helicases
proteins.


BL00039D 21.67 1.900e-[illegible]-614
BL00039A 18.44 1.871e-23 21-60
[illegible]e-11 354-388
BL00039B 19.19 [illegible]e-11 277-303


217 


BL00100 


Chloramphenicol 
acetyltransferase
proteins. 


BL00100D 17 9!? ft ifliio 
09 68-106 


219 


PR00213 


MYELIN P0 PROTEIN 
SIGNATURE


FR002i3C 15 <iO. ~i qcq 0 " ~~ 
11 199-227 


222 


BL00678 


Trp-Asp (WD) repeat
proteins proteins. 


BL00678 9.67 1.947e-09 
144-155 


224 


PR00875 


MOLLUSC METALLOTHIONEIN 
SIGNATURE 


PR00875A 5.83 1.000e-09 901-913


225


BL00636


Nt-dnaJ domain proteins.


BL00636B 15.11 8.200e-19 18-39


226 


BL00636 


Nt-dnaJ domain proteins. 


BL00636A 8.07 1.000e-21 21-38
BL00636B 15.11 8.200e-19 45-66


229 


PR00301


70 KD HEAT SHOCK PROTEIN
SIGNATURE


PR00301F [illegible]e-13 329-346
PR00301G 13.78 4.300e-12 361-382


230 


BL00460 


Glutathione peroxidases 
selenocysteine proteins. 


BL00460A 28.67 8.773e-20 35-70
BL00460B 9.73 7.429e-16 [illegible]
BL00460C 14.35 2.831e-12 111-134
BL00460D 16.89 8.773e-11 140-160


231


PR00647


SENR ORPHAN RECEPTOR
SIGNATURE


PR00647B 10.19 8.522e-09 273-287


233


BL00292


Cyclins proteins.


BL00292B 20.31 7.429e-27 244-275
BL00292A 22.87 7.750e-27 201-235


234


PR00449


TRANSFORMING PROTEIN P21
RAS SIGNATURE


PR00449A 13.20 6.308e-13 7-29
PR00449C 17.27 4.462e-11 47-70
PR00449D 10.79 7.120e-11 109-123


235 


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE 


PR00019B 11.36 7.300e-[illegible]
[illegible] 11.36 5.320e-09 119-133
PR00019B 11.36 1.000e-08 229-243


236 


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE


PR00019B 11.36 7.300e-10 245-259
PR00019B 11.36 5.320e-09 113-127
PR00019B 11.36 1.000e-08 223-237


237 


PD00289 


PROTEIN SH3 DOMAIN 
REPEAT PRESYNA.


PD00289 9.97 8.448e-09 
67-81 


240 


PR00011


TYPE III EGF-LIKE
SIGNATURE


PR00011D 14.03 3.492e-10 616-635


241 


PR00011


TYPE III EGF-LIKE
SIGNATURE


PR00011D 14.03 3.492e-10 616-635


244 


BL00903 


Cytidine and
deoxycytidylate
deaminases zinc-binding
region s.


BL00903 12.93 8.941e- 
12 54-64 


245 


DM00179


w KINASE ALPHA ADHESION
T-CELL.


DM00179 13.97 8.043e-09 124-134


248 


BL00246 


Wnt-1 family proteins. 


BL00246D 23.97 1.000e-40 186-239
BL00246E 20.32 1.000e-40 305-351
BL00246B 13.69 4.176e-30 105-140
BL00246A 15.75 2.286e-[illegible]
[illegible] 15.56 4.857e-22 150-175


250


PR00927 


ADENINE NUCLEOTIDE 
TRANSLOCATOR 1 SIGNATURE


PR00927E 14.93 5.114e- 
10 253-275 


254 


BL00674


AAA-protein family
proteins.


BL00674B 4.46 1.000e-09 223-245


255 


PD01796 


PROTEIN TRANSMEMBRANE 
COBALT ZINC CADMIU. 


PD01796 15.01 6.045e- 
09 61-88 


255 


BL50002


domain proteins profile. 


BL50002B 15.18 2.800e-10 421-435


258 


PR00094


ADENYLATE KINASE
SIGNATURE


PR00094C 12.94 2.200e-18 87-104
PR00094D 12.52 2.731e-14 161-177
PR00094A 10.31 5.500e-14 11-25
PR00094B 11.01 4.115e-13 39-54
PR00094E 11.25 7.311e-[illegible]-193


259 


BL00892 


HIT family proteins. 


BL00892A 18.17 5.500e-13 60-91


262 


BL00388 


Proteasome A-type
subunits proteins.


BL00388A [illegible] 1.000e-40 8-54
BL00388B [illegible]
BL00388D 20.71 1.000e-21 153-184
BL00388C 18.79 8.147e-16 126-148


264 


BL00903 


Cytidine and 
deoxycytidylate 
deaminases zinc-binding 
region s. 


BL00903 12.93 5.821e- 
09 91-101 


267 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107B 13.31 1.529e- 
09 241-257 


270 


BL00226 


Intermediate filaments 
proteins. 


BL00226D 19.10 1.000e-37 362-409
BL00226B 23.86 8.043e-35 196-244
BL00226C 13.23 7.000e-20 261-292
BL00226A 12.77 6.143e-15 96-111


271 


PD02952 


KINASE TRANSFERASE 

V— il^lJ J. IN tli V M\J 1 C<lrJ 

MULTIGENE FAMI. 


PD02952C 15.76 9.731e-[illegible]
PD02952B 15.57 5.625e-09 215-229


272 


PD02929 


ADHESION GLYCOPROTEIN 

DPPf'l lUgfiD T 


PD02929A 28.27 1.000e-40 106-160
PD02929B 18.36 8.800e-17 179-199


274 


BL01027 


Glycosyl hydrolases 
family 39 proteins. 


BL01027B 15.34 3.486e- 
09 213-250 


275 


PR00424 


ADENOSINE RECEPTOR 
SIGNATURE 


PR00424D 14.32 6.451e- 
11 39-59 


277 


BL00052 


Ribosomal protein S7
proteins.


BL00052A 27.85 6.000e-13 137-184
BL00052B 15.17 5.143e-12 208-235


279 


BL00790 


Receptor tyrosine kinase 
class V proteins. 


BL00790N 13.25 5.659e- 
13 267-294 


280 


PR00319 


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE


PR00319D 11.64 6.625e-23 107-125
PR00319C 13.41 1.000e-21 89-105
PR00319A 15.27 8.364e-21 51-68
PR00319B 11.47 8.200e-19 70-85


281 


PR00319 


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE


PR00319D 11.64 6.625e-23 94-112
PR00319C 13.41 1.000e-21 76-92
PR00319A 15.27 8.364e-21 38-55
PR00319B 11.47 8.200e-19 57-72


287 


PF00929 


Exonuclease.


PF00929D 16.17 7.366e- 
09 149-163 


291 


BL00326 


Tropomyosins proteins. 


BL00326A 14.01 2.360e- 
09 93-127 


292 


BL00326 


Tropomyosins proteins. 


BL00326A 14.01 2.360e- 
09 93-127 


294 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 8.714e- 
12 203-216 


295 


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 5.500e-15 322-339
BL00028 16.07 9.471e-14 433-450
BL00028 16.07 4.600e-13 648-665
BL00028 16.07 5.500e-13 760-777
BL00028 16.07 9.550e-13 788-805
BL00028 16.07 3.348e-12 704-721
BL00028 16.07 6.478e-12 461-478
BL00028 16.07 8.435e-12 844-861
BL00028 16.07 [illegible] 593-610
BL00028 16.07 2.038e-11 211-228
BL00028 16.07 5.154e-11 732-749
BL00028 16.07 5.846e-11 377-394
BL00028 16.07 6.885e-11 816-833
BL00028 16.07 7.231e-11 676-693
BL00028 16.07 9.654e-11 564-581
BL00028 16.07 4.086e-09 517-534
BL00028 16.07 7.429e-09 489-506


296 


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 8.333e-16 111-136
BL00215A [illegible]
BL00215B 10.44 9.526e-11 152-165
BL00215B 10.44 7.375e-10 59-72
BL00215A 15.82 9.824e-10 205-230


302 


PF00953


Glycosyl transferase.


PF00953C 19.70 8.773e-[illegible]
[illegible] 19.68 5.000e-25 102-129
PF00953B 6.17 [illegible]


304 


PF00152


tRNA synthetases class
II.


PF00152D 21.30 8.364e-[illegible]
[illegible] 28.03 9.250e-21 220-257
PF00152[illegible] 2.658e-13 159-184
PF00152A 19.68 5.714e-11 44-67


305 


PD01066 


PROTEIN ZINC FINGER
ZINC-FINGER METAL-BINDING NU.


PD01066 19.43 8.250e-35 37-76


306


PD02784 


PROTEIN NUCLEAR 
RIBONUCLEOPROTEIN.


PD02784B 26.46 5.840e- 
09 92-135 


307 


PR00454 


ETS DOMAIN SIGNATURE 


PR00454C 11.24 7.808e- 
09 1167-1186 


308 


PR00237 


RHODOPSIN-LIKE GPCR
SUPERFAMILY SIGNATURE


PR00237E 13.03 5.091e-13 188-212
PR00237G 19.63 7.207e-13 268-295
PR00237A 11.48 4.375e-11 24-49
PR00237C 15.69 3.057e-10 101-124
PR00237D 8.94 4.750e-10 137-159
PR00237F 13.57 5.364e-[illegible]
[illegible] 13.50 9.438e-10 57-79


309 


BL00522 


DNA polymerase family X
proteins.


[illegible]e-24 315-339
BL00522F 14.90 1.310e-15 470-494
BL00522A 25.52 1.265e-14 179-226
BL00522E 19.63 8.615e-14 430-460
BL00522B 27.30 9.625e-12 267-313




310 


BL00326 


Tropomyosins proteins.


BL00326D 8.76 5.235e- 
10 856-897 




312 


BL00290


Immunoglobulins and
major histocompatibility
complex proteins.


BL00290A 20.89 4.706e-14 151-174
BL00290B 13.17 9.000e-12 211-229




313 


BL00345 


Ets-domain proteins. 


BL00345B 21.28 1.000e-40 34-85
BL00345A 13.96 9.217e-16 1-20




315 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 5.091e-15 63-76




317 


BL01020 


SAR1 family proteins.


BL01020C 15.35 3.198e- 
17 79-130 




318 


BL00216


Sugar transport 
proteins. 


BL00216B 27.64 4.696e- 
11 164-214 




320 


PR00109 


TYROSINE KINASE
CATALYTIC DOMAIN
SIGNATURE


PR00109B 12.27 4.814e-10 216-235


321 


BL00027 


'Homeobox' domain 
proteins.


BL00027 26.43 5.688e-10 329-372


322 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 8.765e- 
12 558-577 


324 


BL01241 


Link domain proteins. 


BL01241 35.81 8.313e-30 183-236
BL01241 35.81 3.222e-13 282-335


326 


BL00412 


Neuromodulin (GAP-43)
proteins.


BL00412D 16.54 4.000e-12 515-566
BL00412D 16.54 5.705e-11 516-567
BL00412D 16.54 7.848e-10 518-569
BL00412D 16.54 1.827e-09 514-565
BL00412D 16.54 1.918e-09 513-564
BL00412D 16.54 2.102e-09 520-571


328 


BL00232 


Cadherins extracellular 
repeat proteins domain 
proteins.


BL00232B 32.79 9.557e-20 151-199
BL00232B 32.79 2.246e-18 41-89
BL00232B 32.79 5.985e-18 370-418
BL00232B 32.79 5.500e-16 258-306
BL00232B 32.79 9.384e-15 475-523
BL00232C 10.65 2.537e-12 256-274
BL00232C 10.65 4.326e-11 368-386
BL00232C 10.65 7.261e-11 473-491
BL00232C 10.65 7.457e-11 39-57


330 


PR00454 


ETS DOMAIN SIGNATURE 


PR00454C 11.24 7.808e- 
09 1167-1186 


331 


BL00598


Chromo domain proteins. 


BL00598 14.45 8.393e- 
18 27-49 


333 


BL01016 


Glycoprotease family 
proteins.


BL01016C 22.84 3.925e-32 70-115
BL01016E 14.88 5.286e-19 149-177
BL01016H 13.71 7.577e-13 291-301
BL01016D 8.86 3.298e-11 127-140
BL01016G 7.14 5.622e-10 261-271
BL01016A 5.65 7.167e-[illegible]
BL01016F 13.34 1.563e-09 200-212
BL01016B 8.93 [illegible]


339 


BL01115 


GTP-binding nuclear
protein ran proteins.


BL01115A 10.22 5.500e- 

-t-L -L / O J. 


340 


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL- 


PD01066 19.43 1.231e- 
33 10-49 


341 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 5.042e- 
09 55-109 


342 


PD01066 


PROTEIN ZINC FINGER
ZINC-FINGER METAL-BINDING NU.


PD01066 19.43 2.400e- 
30 16-55 


Mi 


DM00031 


IMMUNOGLOBULIN V REGION. 


DM00031A 16.80 1.000e-40 20-68


346 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE


PR00109B 12.27 4.764e-11 135-154


347 


PR00109 


TYROSINE KINASE
CATALYTIC DOMAIN
SIGNATURE


PR00109B 12.27 4.764e-11 135-154


351 


BL01187 


Calcium-binding EGF-like
domain proteins pattern
proteins.


BL01187B 12.04 1.783e- 
13 100-116 BL01187B 
12.04 8.435e-13 276- 
292 BL01187B 12.04 
8.800e-11 13-29
BL01187B 12.04 7.429e- 
10 54-70 BL01187B 
12.04 5.725e-09 231- 
247 BL01187A 9.98 
7.000e-09 255-267 


352 


PD00078 


REPEAT PROTEIN ANK
NUCLEAR ANKYR. 


PD00078B 13.14 5.950e-10 366-379
PD00078B 13.14 4.522e-09 168-181


354


BL00380


Rhodanese proteins.


BL00380F 9.76 6.694e- 
11 542-553 


355 


PF00628


PHD-finger.


PF00628 15.84 1.000e-11 116-131


356 


PR00587


SOMATOSTATIN RECEPTOR 
TYPE 1 SIGNATURE 


PR00587A 8.06 9.700e- 
09 17-37 


359 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 4.4^2e-15 261-274
PD00066 13.92 6.500e-13 233-246
PD00066 13.92 4.300e-09 289-302


361 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin
receptors.


PF00791B 28.49 9.604e-13 54-109
PF00791B 28.49 1.095e-12 21-76
PF00791A 27.85 1.432e-09 71-126
PF00791B 28.49 7.440e-09 184-239


362 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin
receptors.


PF00791B 28.49 2.273e- 
11 279-334 


363 


PR00450 


RECOVERIN FAMILY
SIGNATURE


PR00450C 12.22 5.050e-10 73-95
PR00450C 12.22 3.278e-09 109-131


364 


PF00242 


DNA polymerase (viral) 
N-terminal domain
proteins.


PF00242Q 13.51 2.328e-09 22-68


365 


PF00242


DNA polymerase (viral)
N-terminal domain
proteins.


PF00242Q 13.51 2.328e- 
09 22-68 


366 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 6.644e-09 1038-1092


367 


PR00019


LEUCINE-RICH REPEAT
SIGNATURE 


PR00019B 11.36 1.360e- 
09 229-243 PR00019B 
11.36 6.040e-09 91-105 
PR00019A 11.19 8.667e- 
09 370-384 


368


PR00011 


TYPE III EGF-LIKE 
SIGNATURE 


PR00011D 14.03 9.000e-15 30-49
PR00011A 14.06 9.830e-15 30-49
PR00011B 13.08 4.500e-14 30-49
PR00011C 24.25 5.143e-09 6-35


369 


BL01032 


Protein phosphatase 2C 
proteins.


BL01032H 11.25 4.150e- 
09 417-430 


372 


BL00478 


LIM domain proteins. 


BL00478B 14.79 7.750e- 
12 410-425 


373 


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-BINDING NU.


PD01066 19.43 9.757e- 
34 26-65 


376 


PR00170 


SODIUM CHANNEL SIGNATURE


PR00170E £.48 2.739e-10 88-118


380


BL00107


Protein kinases ATP-binding
region proteins.


BL00107A 18.39 1.000e-23 276-307
BL00107B 13.31 1.692e-12 342-358


381 


BL00455 


Putative AMP-binding
domain proteins. 


BL00455 13.31 5.714e- 
12 50-66 


382 


PR00624 


HISTONE H5 SIGNATURE


PR00624G 4.08 4.900e-09 524-544


384 


PD00078


REPEAT PROTEIN ANK
NUCLEAR ANKYR.


PD00078B 13.14 5.950e-10 366-379
PD00078B 13.14 4.522e-09 168-181


385


PR00511 


TEKTIN SIGNATURE 


PR00511D 7.11 5.371e- 
09 67-80 


385 


PD02870 


RECEPTOR INTERLEUKIN-1
PRECURSOR.


PD02870B 18.83 £.000e- 
10 97-130 


383 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 5.000e- 
13 516-529 


383 


BL00290 


Immunoglobulins and 
major histocompatibility 
complex proteins.


BL00290A 20.89 7.657e- 
09 151-174 


390 


BL00215 


Mitochondrial energy
transfer proteins.


BL00215A 15.82 5.200e-15 221-246
BL00215A 15.82 7.618e-14 20-45
BL00215A 15.82 8.851e-11 123-148
BL00215B 10.44 [illegible] 69-82
BL00215B 10.44 7.300e-[illegible]
[illegible] 10.44 8.500e-09 165-178


394 


BL00674 


AAA-protein family 
proteins. 


BL00674B 4.46 2.723e- 
16 299-321 


397 


PR00048 


C2H2-TYPE ZINC FINGER
SIGNATURE 


11 141-155 


398 


PR00761 


BINDIN PRECURSOR
SIGNATURE 


PR00761B 9.93 6.764e- 
09 55-74 


399 


BL00240 


Receptor tyrosine kinase 
class III proteins. 


BL00240B 24.70 7.907e- 
10 118-142 


401 


PF00676 


Dehydrogenase E1
component.


PF00676B 74.71 8.071e-18 331-369
PF00676D 14.40 3.854e-15 486-[illegible]
PF00676C 16.88 9.182e-14 454-478


402 


BL00514 


Fibrinogen beta and

gamma chains C- terminal 
domain proteins. 


[illegible]e-28 4432-4469
BL00514G 15.98 6.092e-14 4555-4585
BL00514D 15.35 2.532e-12 4473-4486
BL00514F 11.65 4.288e-10 4519-4534
[illegible] 14.95 4.955e-10 4584-4609


403 


PF00992 


Troponin.


PF00992A 16.67 5.974e- 
09 105-140 


404 


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE 


PR00019B 11.36 1.450e-10 73-87
PR00019A 11.19 8.043e-10 76-90
PR00019B 11.36 1.000e-09 50-64
PR00019B 11.36 1.000e-09 96-110


40* 


BL00232 


Cadnerins extracellular 
repeat proteins domain 
proteins . 


BL00232B 32.79 9.557e-
20 139-187 BL00232B 
32.79 2.246e-18 29-77 
BL00232B 32.79 5.985e- 
18 358-406 BL00232B 
32.79 5.500e-16 246-
294 BL00232B 32.79
9.384e-15 463-511
BL00232C 10.65 2.537e-
12 244-262 BL00232C
10.65 4.326e-11 356-
374 BL00232C 10.65
7.261e-11 461-479

11 27-45 


407 


PF00426


Outer Capsid protein VP4
(Hemagglutinin).


PFQQ426S Xd.dV 5.634e- 
09 902-940 


409 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 9.695e- 
09 126-180 


410 


BL00741 


Guanine-nucleotide
dissociation stimulators 
CDC24 family sign. 


BL00741B 14.27 2.731e- 
09 252-275 


411 


PF00646 


F-box domain proteins. 


PF00646A 14.37 6.344e- 
09 86-100 


412 


BL00603


Thymidine kinase 
cellular-type proteins.


BL00603B 11.39 8.500e-
09 542-557 


415 


BL00866 


Carbamoyl-phosphate
synthase subdomain 
proteins . 


BL00866B 36.29 3.571e- 
31 245-291 BL00866C 
23.26 9.000e-25 331-
366 


418 


PR00239 


MOLLUSCAN RHODOPSIN C- 
TERMINAL TAIL SIGNATURE 


PR00239E 1.58 6.114e-
09 590-602 


421 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin 
receptors . 


PF00791B 28.49 7.955e- 
14 23-78 PF00791B 
28.49 3.653e-12 273- 
328 PF00791B 28.49
4.273e-11 156-211
PF00791B 28.49 7.818e- 
11 89-144 PF00791B 
28.49 1.524e-10 56-111 
PF00791C 20.98 3.559e- 
09 37-76 PF00791C 
20.98 5.235e-09 170- 
209 PF00791C 20.98 
5.235e-09 381-420 
PF00791B 28.49 6.202e- 
09 189-244 PF00791B 
28.49 7.028e-09 435- 
490 PF00791B 28.49 
8.679e-09 367-422 


424 


DM00892 


3 RETROVIRAL PROTEINASE. 


DM00892C 23.55 7.207e- 
28 1645-1679 


425 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109D 17.04 5.881e-
10 228-251 


Ana 


BL00518


Zinc finger, C3HC4 type
(RING finger), proteins. 


BL00518 12.23 4.600e- 
11 31-40 


431 


BL00039 


DEAD-box subfamily ATP-
dependent helicases 
proteins. 


BL00039D 21.67 1.844e- 
34 490-536 BL00039A 
18.44 5.615e-19 205-
244 BL00039B 19.19
8.920e-16 251-277
BL00039C 15.63 5.781e- 
15 333-357 


432 


PR00452


SH3 DOMAIN SIGNATURE 


PR00452B 11.65 7.652e- 
12 169-185 


433 


PR00828 


FORMIN SIGNATURE 


PR00828B 5.23 8.218e-
10 382-405 


436 


BL00415 


Synapsins proteins. 


BL00415N 4.29 8.643e- 
11 195-239 BL00415N 
4.29 3.036e-09 809-853 


443 


PR00834 


HTRA/DEGQ PROTEASE 
FAMILY SIGNATURE 


PR00834F 10.91 6.040e- 
11 221-234 


446 


PF01140 


Matrix protein (MA),
p15.


PF01140D 15.54 9.663e-
10 183-218 PF01140D
15.54 3.093e-09 246-
281


449 


PR00568


DOPAMINE D3 RECEPTOR 
SIGNATURE 


PR00568G 13.95 5.551e-
09 39-53 


451 


PF00084 


Sushi domain proteins 
(SCR repeat proteins. 


PF00084B 9.45 3.813e- 
10 47-59 


452 


BL00790 


Receptor tyrosine kinase 
class V proteins. 


BL00790I 20.01 2.821e-
09 618-649 


456 


PR00380


KINESIN HEAVY CHAIN
SIGNATURE 


PR00380A 14.18 1.000e-
25 77-99 PR00380D 
9.93 1.000e-21 281-303 
PR00380C 13.18 8.286e- 
17 230-249 PR00380B 
12.64 4.724e-16 194- 
212 


457 


PR00253 


GAMMA-AMINOBUTYRIC ACID 
(GABA) RECEPTOR 
SIGNATURE 


PR00253A 9.15 9.143e- 
24 246-267 PR00253B 
13.47 2.000e-23 272- 
294 PR00253C 13.85 
7.000e-23 306-328 
PR00253D 16.68 5.950e-
21 452-473 


467 


PR00849 


GLYCOSYL HYDROLASE 
FAMILY 58 SIGNATURE 


PR00849D 9.77 9.236e- 
09 910-937 


471 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 8.200e-12
33-44 


472 


BL00226


Intermediate filaments 
proteins . 


BL00226B 23.86 3.721e-
09 282-330 


473 


BL00344 


GATA-type zinc finger 
domain proteins. 


BL00344 17.99 7.000e- 
12 814-852 


474 


BL00481 


Thiol-activated
cytolysins proteins. 


BL00481E 13.07 8.909e- 
09 173-199 


479 


PR00319 


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE


PR00319B 11.47 2.571e-
09 393-408 


480 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 1.900e- 
38 8-47 


481 


PR00405 


HIV REV INTERACTING 
PROTEIN SIGNATURE 


PR00405C 19 at l nnno. 
19 451-473 PR00405B 
11.83 4.333e-18 430- 
448 PR00405A 17.71 
4.971e-18 411-431 


482 


PR00049 


WILM'S TUMOUR PROTEIN
SIGNATURE 


PR00049D 0.00 9.286e- 
10 959-974 PR00049D 
0 00 9 857e-10 9i;S-.Q*7 , 5 
PR00049D 0.00 1.305e-
09 937-952 PR00049D 
0.00 8.322e-09 939-954 


486 


PR00007 


COMPLEMENT C1Q DOMAIN 
SIGNATURE 


PR00007B 14.16 8.615e-
23 653-673 PR00007A 
19.33 6.192e-22 626- 
653 PR00007C 15.60 
5.846e-19 698-720 
PR00007D 9.64 3.647e- 
13 732-743 


487 


PD00567


PROTEIN RNA-BINDING RNA
REPEAT HYD. 


PD00567B 18.23 2.853e- 
09 200-214 


488


PR00988


URIDINE KINASE SIGNATURE


PR00988A 6.39 4.569e- 
12 3-21 


489 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL- 
BINDING NU. 


PD01066 19.43 4.882e- 
27 30-69 PD01066 
19.43 3.430e-10 71-110 


490 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 7.864e- 
09 663-678 


492


BL01128


Shikimate kinase
proteins.


BL01128A 18.84 6.464e-
17 58-92


497


PF00429


ENV polyprotein (coat
polyprotein).


PF00429 31.08 7.171e-
15 21-71


498 


BL00120 


Lipases, serine 
proteins. 


BL00120B 11.37 ?.923e- " 
09 185-200 


500 


BL00030 


Eukaryotic RNA-binding 
region RNP-1 proteins. 


BL00030A 14.39 7.353e- 
11 299-318 


501 


BL01159 


WW/rsp5/WWP domain
proteins . 


BL01159 13.85 8.579e- 
12 131-146 


505 


BL00021 


Kringle domain proteins. 


BL00021B 13.33 3.739e-
17 492-510 


508 


PR00120 


H+-TRANSPORTING ATPASE
(PROTON PUMP) SIGNATURE 


PR00120C 9.90 5.800e-
19 705-722 


509 


DM01417


6 kw INDUCING XPMC2 
MUSHROOM SPAC22G7.04.


DM01417E 20.62 2.938e- 
16 362-395 DM01417D 
11.08 3.800e-13 322- 
338 


510 


PF00534


Glycosyl transferases 
group 1 . 


PF00534B 14.47 6.625e- 
09 346-370 


511 


PF00534 


Glycosyl transferases 
group 1. 


PF00534B 14.47 6.625e-
09 293-317 


512 


PF00534 


Glycosyl transferases 
group 1. 


PF00534B 14.47 6.625e-
09 366-390 


513 


PD01841 


PHOSPHORYLASE KINASE 
ALPHA MUSCL. 


PD01841A 21.71 1.000e-
40 110-160 PD01841B
14.35 1.000e-40 181-
222 PD01841D 17.87
1.000e-40 243-295
PD01841F 13.36 1.000e-
40 333-382 PD01841G
24.26 1.000e-40 386-
440 PD01841L 18.42
1.000e-40 968-1010
PD01841I 23.00 4.545e-
37 762-804 PD01841E
18.60 3.750e-36 295-
333 PD01841J 14.94
6.023e-35 851-888
PD01841H 21.30 2.909e-
33 490-527 PD01841K
14.81 7.088e-33 924-
954 PD01841C 13.78
9.386e-23 222-243

21 1054-1073 PD01841I 

591 


514 


PR00153


CYCLOPHILIN PEPTIDYL- 
PROLYL CIS-TRANS
ISOMERASE SIGNATURE 


PR00153C 11.01 7.188e- 
13 95-111 PR00153E
9.10 4.150e-12 122-138 


515 


BL00740 


MAM domain proteins. 


BL00740A 13.87 7.185e-
12 410-423 


516 


DM00892


3 RETROVIRAL PROTEINASE. 


DM00892C 23.55 6.087e- 
12 1018-1052 


517 


BL00242 


Integrins alpha chain 
proteins. 


BL00242C 16.86 8.320e-
09 12-42 


523 


DM00031 


IMMUNOGLOBULIN V REGION.


mIIV Jin -L o • o yJ J • / o \J Gi ~ 

39 20-68 DM00031B 
15.41 1.000e-25 84-118


525 


BL00319 


Amyloidogenic 
glycoprotein 
extracellular domain 
proteins . 


BL00319C 17.12 8.375e-
10 61-95 


526 


PF00789 


Domain present in 
ubiquitin-regulatory
proteins . 


PF00789B 19.70 3.308e- 
12 322-343 PF00789C 
20.98 5.269e-09 367-
392 


528 


BL01162 


Quinone oxidoreductase / 
zeta-crystallin 
proteins . 


BL01162C 22.80 1.500e- 
16 120-164


529 


PR00910


LUTEOVIRUS ORF6 PROTEIN
SIGNATURE 


PR00910A 2.51 3.893e- 
09 60-73 


532 


BL00215


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 4.000e- 
17 11-36 BL00215A 
15.82 8.660e-11 123-
148 


533 


BL00215 


Mitochondrial energy- 
transfer proteins. 


BL00215A 15.82 4.000e- 
17 11-36 BL00215A 
15.82 8.660e-11 97-122


534 


BL00098 


Thiolases acyl-enzyme 
intermediate proteins. 


BL00098C 21.65 2.800e- 
38 181-227 BL00098B 
32.59 5.345e-38 86-141 
BL00098D 26.30 8.364e- 
35 245-288 BL00098E 
22.12 1.000e-34 314- 
352 BL00098F 10.18
4.971e-22 365-386 
BL00098A 10.60 6.455e- 
11 38-50 


535 


PR00370 


FLAVIN-CONTAINING
MONOOXYGENASE (FMO)
SIGNATURE 


PR00370E 11.96 7.429e- 
22 321-340 PR00370D 
16.33 6.143e-21 185- 
204 PR00370F 17.75
6.559e-21 376-396 
PR00370B 10.91 9.591e- 
21 27-46 PR00370C
12.72 3.500e-20 140- 
157 PR00370A 3.35 
6.442e-17 4-20 


536 


BL00028


Zinc finger, C2H2 type, 
domain proteins . 


BL00028 16.07 7.429e-
16 285-302 BL00028
16.07 6.294e-14 341-
358 BL00028 16.07
1.346e-11 369-386
BL00028 16.07 1.692e-
11 397-414 BL00028
16.07 4.452e-11 453-
470 BL00028 16.07
7.231e-11 425-442
BL00028 16.07 4.300e- 
10 313-330 


537 


BL00762 


WHEP-TRS domain 
proteins . 


BL00762A 23.43 9.419e- 
15 844-881 


538 


BL00762 


WHEP-TRS domain 
proteins . 


BL00762A 23.43 9.419e- 
15 819-856 


539


BL00762


WHEP-TRS domain
proteins . 


BL00762A 23.43 9.419e- 
15 822-859 


540 


PR00985 


LEUCYL-TRNA SYNTHETASE 
SIGNATURE 


PR00985A 12.10 9.000e-
10 357-375 


541 


PD02102 


SUBUNIT E V-ATPASE 
VACUOLAR ATP SYNTHASE 
HYDROL. 


PD02102A 16.74 1.000e-
40 3-47 PD02102B 
18.28 4.375e-34 57-100 
PD02102D 21.69 1.923e- 
30 179-218 PD02102C 
26.34 8.929e-26 100- 
146 


543 


BL00028


Zinc finger, C2H2 type, 
domain proteins . 


BL00028 16.07 1.000e-
10 48-65 BL00028
16.07 6.400e-10 193- 
210 BL00028 16.07 
1.000e-09 343-360 
BL00028 16.07 6.914e- 
09 78-95 


545 


BL00250


TGF-beta family 
proteins. 


BL00250A 21.24 8.000e-
31 293-329 BL00250B 
27.37 5.286e-24 354- 
390 


547 


PR00319 


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE


PR00319B 11.47 2.714e-
09 186-201 PR00319A
227


548 


BL01204 


NF-kappa-B/Rel/dorsal
domain proteins. 


40 8-56 BL01204D 
16.42 1.000e-40 177- 
221 BL01204E 13.83 
7.652e-30 225-250 
BL01204C 13.93 8.714e- 
22 141-160 BL01204B 
15.41 4.333e-16 102- 
116 


549 


PR00326 


GTP1/OBG GTP-BINDING
PROTEIN FAMILY SIGNATURE 


PR00326A 8.75 8.364e- 
15 255-276 


551 


PF00632 


HECT-domain (ubiquitin- 
transferase).


PF00632C 20.66 3.302e- 

?t 1 ^fvO^I DT?rtr*»(C"a "5X1 
£• ^ XjOJ iOUi rf UUo j/!d 

18.45 3.700e-21 1515- 
1543 


554 


BL00290 


Immunoglobulins and 
major histocompatibility 
complex proteins. 


oi)Uu« y ud X.3 . J. / i . buus* 
14 187-205 BL00290A 
20.89 2.059e-14 130- 
153 


557 


DM00215 


PROLINE-RICH PROTEIN 3 . 


DM00215 19.43 6.339e- 

09 846-879


559 


DM01111 


4 kw PHOSPHATASE 
TRANSFORMING 61K PDF1 . 


DM01111L 11.93 3.762e- 
09 7-35 


562 


PF00658 


Poly-adenylate binding
protein, unique domain
proteins.


rtUUOSbl. lb - jj 3.455e- 
32 118-155 


564 


BL00141 


Eukaryotic and viral 
aspartyl proteases 
proteins . 


BL00141A 12.10 4.150e- 


566 


PF00855 


PWWP domain proteins. 


PF00855 13.75 5.667e- 
15 272-289 


567 


PD01066 


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


f UUlUbb 19.43 4.97/Q- 
13 229-268 


569 


BL00107


Protein kinases ATP-
binding region proteins. 


BL00107A 18.39 7.000e- 
19 118-149 BL00107B 
13.31 5.500e-15 183-
199 


570 


BL00107


Protein kinases ATP- 
binding region proteins . 


BL00107A 18.39 7.000e- 
19 118-149 BL00107B 
JL-J.JO. 183- 
199 


572 


PR00193 


MYOSIN HEAVY CHAIN 
SIGNATURE 


PR00193D 14.36 1.857e- 
34 454-483 PR00193C 
12.60 2.636e-31 223- 
251 PR00193B 11.69 
7.750e-29 171-197 
PR00193A 15.41 2.588e- 
22 115-135 PR00193E 
19.47 6.559e-19 508-
537 


573 


PR00193 


MYOSIN HEAVY CHAIN 
SIGNATURE 


PR00193D 14.36 1.857e-

34 470-499 PR00193C 

267 PR00193B 11.69 
7.750e-29 171-197 
PR00193A 15.41 2.588e- 
22 115-135 PR00193E 
19.47 6.559e-19 524- 
553 


575 


BL00752 


XPA protein. 


BL00752B 19.17 9.703e-
10 885-925 


576 


BL00030 


Eukaryotic RNA-binding 
region RNP-1 proteins. 


BL00030A 14.39 7.000e- 
09 276-295


577


BL00116 


DNA polymerase family B
proteins.


BL00116A 12.81 5.737e-
13 864-877 BL00116B

XX. X.D^je— Xx i?3x —

965


578 


BL00195 


Glutaredoxin proteins . 


BL00195B 15.31 7 . 158e- 
09 121-141 


579 


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE 


PR00019B 11.36 9.000e- 
11 217-231 PR00019B 
11.36 1.350e-09 386- 
400 PR00019A 11.19
3.333e-09 389-403 
PR00019B 11.36 8.920e- 
09 363-377 


580 


PR00253


GAMMA-AMINOBUTYRIC ACID 
(GABA) RECEPTOR 
SIGNATURE


PR00253A 9.15 2.125e- 
25 275-296 PR00253B 
13.47 7.923e-24 301- 
323 PR00253D 16.68 
5.846e-23 444-465
PR00253C 13.85 2.241e- 
20 335-357 


583 


PR00343 


SELECTIN SUPERFAMILY
COMPLEMENT-BINDING
REPEAT SIGNATURE 


PR00343C 16.85 2.286e- 
11 1233-1252 PR00343C 
16.85 5.500e-11 333-
352 PR00343C 16.85
5.500e-11 783-802
PR00343C 16.85 4.246e-
10 1491-1510 PR00343C
16.85 8.230e-10 1686-
1705


584 


DM01537


kw SKI2W SKI2 NUCLEOLAR 
HELICASE. 


DM01537B 21.63 1.878e- 
37 79-126 DM01537B 
21.63 9.491e-30 916- 
963 DM01537A 15.14
3.196e-11 784-804


586 


PF00013


KH domain proteins
family of RNA binding
proteins.


PF00013 5.78 1.450e-09 
124-136 


587


DM00892


3 RETROVIRAL PROTEINASE. 


DM00892C 23.55 4.409e- 
it. o co o ar 


589 


BL00478 


LIM domain proteins. 


BL00478B 14.79 1.643e- 
13 261-276 BL00478B 
14.79 7.709e-09 321- 
336 


590 


PF00855 


PWWP domain proteins. 


PF00855 13.75 8.000e-
15 931-948


591


PF00855 


PWWP domain proteins . 


PF00855 13.75 8.000e-
15 1062-1079 


593 


PF00628 


PHD-finger.


PF00628 15.84 3.455e-
12 424-439 


594 


PR00205 


CADHERIN SIGNATURE 


PR00205B 11.39 2.241e-
16 558-576 PR00205A 
14.73 9.300e-13 542- 
558 PR00205C 13.65 
5.304e-12 594-609 
PR00205B 11.39 4.273e- 
10 336-354 


596 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 4.789e- 
18 307-338 


598 


PD01675 


GLYCOPROTEIN MAJOR 
ENVELOPE PROBABLE U3 . 


PD01675C 19.89 2.330e- 
10 55-39 


600 


BL00242 


Integrins alpha chain 
proteins. 


BL00242E 9.03 9.591e- 
27 985-1014 BL00242C 
16.86 4.115e-26 286-
316 BL00242D 13.57
4.150e-25 357-382
BL00242B 8.13 7.353e-
12 189-199 BL00242D
13.57 3.455e-11 421-
446 BL00242A 13.80 






WO 01/53312 



PCT/US00/34263 



5.000e-11 61-73
BL00242D 13 57 4 qnfip- 
10 291-316 


601 


PR00320


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320A 16.74 5.610e- 
09 198-213 


602 


PR00278 


PANCREATIC HORMONE 
SIGNATURE 


PR00278A 12.43 4.569e- 
10 331-348 


603 


BL00479 


Phorbol esters / 
diacylglycerol binding 
domain proteins . 


BL00479C 12.01 3.250e- 
12 170-183 


604 


BL00315 


Dehydrins proteins. 


BL00315A 9.35 1.672e- 
09 424-452 


605 


BL00415 


Synapsins proteins. 


BL00415N 4.29 9.794e- 
10 295-339 


606 


PR00926 


MITOCHONDRIAL CARRIER
PROTEIN SIGNATURE 


PR00926F 17.75 1.000e-
13 335-358 


608 


PF00855 


PWWP domain proteins. 


PF00855 13.75 5.167e-
15 265-282 


609 


PF00855


PWWP domain proteins . 


PF00855 13.75 5.167e- 
15 211-228 


612 


DM01206 


CORONAVIRUS NUCLEOCAPSID
PROTEIN. 


DM01206B 10.69 7.411e- 
10 877-897 DM01206B 
10.69 8.027e-10 861- 
881 DM01206B 10.69 
9.137e-10 873-893 
DM01206B 10.69 1.456e-
09 859-879 DM01206B 
10.69 1.797e-09 879- 
899 DM01206B 10.69 
4.076e-09 865-885 
DM01206B 10.69 7.038e- 
09 898-918 DM01206B 
10.69 7.949e-09 871- 
891 DM01206B 10.69 
8.291e-09 767-787


615 


PD02699 


PROTEIN DNA-BINDING

▼J T\ITlTWr» niTR 

fci JUNDXNCj DNA. 


PD02699A 8.91 2.023e- 
28 129-158 PD02699C 
24.84 1.000e-27 317- 
364 PD02699B 18.28 
1.000e-17 158-182


616 


PR00380


KINESIN HEAVY CHAIN
SIGNATURE 


PR00380A 14.18 4.086e- 
22 288-310 PR00380D 
9.93 3.721e-17 486-508 
PR00380B 12.64 2.241e- 
16 410-428 PR00380C 
13.18 2.976e-13 436- 
455


617 


PR00380 


KINESIN HEAVY CHAIN
SIGNATURE


PR00380A 14.18 4.086e-
22 288-310 PR00380D
9.93 3.721e-17 486-508

PR00380B 12.64 2.241e- 
16 410-428 PR00380C 
13.18 2.976e-13 436- 
455 


618 


DM01206


CORONAVIRUS NUCLEOCAPSID
PROTEIN.


DM01206B 10.69 5.143e- 

10.69 2.603e-10 535- 
555 


621 


PR00700 


PROTEIN TYROSINE 
PHOSPHATASE SIGNATURE 


PR00700B 16.80 3.160e-
21 561-582 


622 


BL00239 


Receptor tyrosine kinase
class II proteins. 


BL00239F 28.15 3.222e- 
10 647-692 BL00239C 
18.75 8.304e-10 543- 
566 


623 


PR00407


EUKARYOTIC MOLYBDOPTERIN
DOMAIN SIGNATURE 


PR00407K 9.94 8.448e- 
09 326-339 


624 


BL00641 


Respiratory-chain NADH
dehydrogenase 75 Kd
subunit proteins.


BL00641C 21.10 1.000e-
40 157-202 BL00641E
24.37 1.000e-40 255-
308 BL00641F 33.12
1.000e-40 571-623
BL00641A 17.15 1.818e-
37 48-80 BL00641B
12.62 5.846e-34 113-
139 BL00641D 13.23
9.308e-29 216-240


627 


PR00103 


CAMP-DEPENDENT PROTEIN
KINASE SIGNATURE 


PR00103E 17.80 2.500e- 
18 367-380 PR00103B 
13.39 2.080e-14 297- 
312 PR00103A 9.59 
2.957e-14 282-297 
PR00103D 10.83 3.077e- 
12 346-358 PR00103C 
15.68 1.000e-11 334-
344 PR00103B 13.39
1.450e-11 175-190
PR00103A 9.59 1.720e- 
10 160-175 


630 


PR00081


GLUCOSE/RIBITOL 
DEHYDROGENASE FAMILY 
SIGNATURE 


PR00081A 10.53 6.211e-
16 4-22 


631 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins.


PF00651 15.00 8.500e-
14 37-50 


632 


DM01206 


CORONAVIRUS NUCLEOCAPSID 
PROTEIN. 


DM01206B 10.69 2.233e- 
10 1324-1344 DM01206B 
10.69 4.822e-10 1276- 
1296 DM01206B 10.69 
7.658e-10 1328-1348 
DM01206B 10.69 8.274e- 
10 1280-1300 DM01206B 
10.69 4.532e-09 1320- 
1340 DM01206B 10.69 
7.266e-09 1326-1346 


635 


BL00107


Protein kinases ATP- 
binding region proteins.


BL00107A 18.39 7.600e- 
23 145-176 BL00107B 
13.31 2.636e-13 211- 
227 


636 


BL00657 


Fork head domain 
proteins . 


BL00657A 19.39 1.545e- 
30 101-143 BL00657B 
22.27 7.750e-26 149- 
192 


637 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107B 13.31 1.000e-
10 607-623 


643


BL00018


EF-hand calcium-binding
domain proteins. 


BL00018 7.41 4.913e-09 
199-212 


647


PF00628


PHD-finger.


PF00628 15.84 2.350e-
13 385-400 PF00628
15.84 3.455e-12 464-
479 


648 


BL01129 


yabO/yceC/sfhB family
proteins. 


BLiG1129b 13.25 4.000e- 
25 332-357 BL01129C 
25.56 8.200e-23 236- 
279 BL01129B 12.51
5.118e-13 191-212 


649 


BL01228


Hypothetical cof family 
proteins. 


BL01228D 17.44 3.908e- 
10 455-480 


650 


BL00027 


'Homeobox' domain
proteins. 


BL00027 26.43 6.684e- 
13 771-814 


651


BL50002


Src homology 3 (SH3) 
domain proteins profile. 


BL50002A 14.19 1.750e-
12 1026-1045 


653 


PR00253 


GAMMA-AMINOBUTYRIC ACID 
(GABA) RECEPTOR 
SIGNATURE 


PR00253A 9.15 4.000e- 
24 253-274 PR00253C 
13.85 8.800e-24 313- 
335 PR00253B 13.47
3.143e-22 279-301 
PR00253D 16.68 7.652e-
20 422-443


654 


PD01719 


PRECURSOR GLYCOPROTEIN
SIGNAL RE. 


PD01719A 12.89 4.452e- 
11 969-997 PD01719A 
12.89 3.961e-10 128- 
156 PD01719A 12.89 
7.395e-10 1276-1304 
PD01719A 12.89 1.222e- 
09 1220-1248 


657 


BL00354 


HMG-I and HMG-Y DNA- 
binding domain proteins 
(Ahook).


BL00354C 6.61 8.397e- 
09 563-578 


658 


BL00354 


HMG-I and HMG-Y DNA- 
binding domain proteins 
(Ahook) . 


BL00354C 6.61 8.397e- 
09 580-595


659 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 2.174e-
13 539-572 DM00215
19.43 4.750e-12 549-
582 DM00215 19.43
9.824e-11 551-584
DM00215 19.43 2.929e-
10 548-581 DM00215
19.43 4.054e-10 550-
583 DM00215 19.43
5.339e-10 552-585
DM00215 19.43 7.107e-
10 544-577


660 


PR00688 


XYLOSE ISOMERASE 
SIGNATURE 


PR00688I 13.78 9.515e-
09 224-236 


661 


BL00027 


'Homeobox' domain
proteins . 


BL00027 26.43 5.950e-
23 249-292 


662 


PR00360 


C2 DOMAIN SIGNATURE 


PR00360B 13.61 7.158e- 
10 596-610 


663 


PR00360 


C2 DOMAIN SIGNATURE 


PR00360B 13.61 7.158e- 
10 596-610 


664 


PR00360 


C2 DOMAIN SIGNATURE 


PR00360B 13.61 7.158e- 
10 596-610 


666 


PR00819


CEXX/CFQX SUPERFAMILY 
SIGNATURE 


PR00819B 10.83 8.9QGe- 
10 704-720 


667 


BL50040 


Elongation factor 1 
gamma chain profile. 


BL50040C 22.62 2.143e- 
16 135-178 


668


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE 


PR00019B 11.36 1.360e-
09 139-153 PR00019A 
11.19 1.667e-09 94-108 
PR00019B 11.36 4.600e- 
09 163-177 


670 


BL00018


EF-hand calcium-binding 
domain proteins. 


BL00018 7.41 3.250e-10 
681-694 BL00018 7.41 
6.400e-10 717-730 


672 


PD00131 


ATP-BINDING TRANSPORT
TRANSMEMBR. 


PD00131B 34.97 1.000e-
34 356-410 PD00131C
19.59 1.346e-26 504- 
542 


673 


PR00667 


RETINAL PIGMENT 
EPITHELIUM-RETINAL GPCR
SIGNATURE 


PR00667G 15.33 7.557e-
10 106-123 


674 


PR00320


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320A 16.74 4.857e- 
13 593-608 PR00320B 
12.19 4.115e-12 635- 
650 PR00320C 13.01 
8.435e-ll 717-732 
PR00320C 13.01 2.800e- 
10 635-650 PR00320C 
13.01 6.400e-10 593- 
608 PR00320B 12.19 
3.250e-09 593-608


675 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320A 16.74 4.857e- 
13 572-587 PR00320B 
12.19 4.115e-12 614-
629 PR00320C 13.01
8.435e-11 696-711
PR00320C 13.01 2.800e-
10 614-629 PR00320C
13.01 6.400e-10 572-
587 PR00320B 12.19
3.250e-09 572-587


676 


PR00019


LEUCINE-RICH REPEAT
SIGNATURE


PR00019A 11.19 9.667e-
09 249-263 


679 


PF00642


Zinc finger C-x8-C-x5-C- 
x3-H type (and similar).


PF00642 11.59 3.700e-
16 225-236 PF00642 
11.59 7.900e-12 187- 
198 


680 


PR00308 


TYPE I ANTIFREEZE 
PROTEIN SIGNATURE 


PR00308C 3.83 8.754e- 
10 286-296 


681 


BL00019 


Actinin-type actin-
binding domain proteins. 


BL00019D 15.33 4.200e- 
19 227-257 


682 


PR00700 


PROTEIN TYROSINE 
PHOSPHATASE SIGNATURE 


PR00700D 12.47 4.000e- 
09 99-118 


687 


PR00049


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 8.500e-
10 538-553 


689 


BL01024 


Protein phosphatase 2A
regulatory subunit PR55 
proteins . 


BL01024A 10.26 1.000e-
40 22-69 BL01024B
8.91 1.000e-40 86-127
BL01024C 7.80 1.000e-
40 146-185 BL01024D
13.22 1.000e-40 185-
222 BL01024E 11.96
1.000e-40 222-266
BL01024F 9.42 1.000e-
40 266-317 BL01024G
11.09 1.000e-40 317-
349 BL01024H 13.88
1.000e-40 389-442


691


BL00027


'Homeobox' domain
proteins. 


BL00027 26.43 8.071e-
31 152-195 


692 


BL00211 


ABC transporters family 
proteins . 


BL00211A 12.23 5.050e- 
09 45-57 


693 


BL00211 


ABC transporters family 
proteins . 


BL00211A 12.23 5.050e- 
09 45-57 


694 


BL00211 


ABC transporters family 
proteins . 


BL00211A 12.23 5.050e- 
09 58-70 


696 


BL00680


Methionine
aminopeptidase subfamily
1 proteins.


BL00680 14.37 5.304e- 
17 173-195 


697 


BL00741 


Guanine-nucleotide 
dissociation stimulators 
CDC24 family sign.


BL00741B 14.27 3.418e- 
11 242-265 


698 


DM01930 


2 kw FINGER SMCX SMCY 
YDR096W. 


DM01930E 15.41 1.367e- 
37 170-215 DM01930F 
14.16 8.232e-28 267- 
303 DM01930B 19.86 
9.163e-10 37-71 


700 


PR00869 


DNA-POLYMERASE FAMILY X
SIGNATURE 


PR00869A 12.80 1.281e- 
16 245-263 


701 


PR00048


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 2.174e- 
10 77-91 PR00048A 

■ J* O .O/Uc" 1U 1JJ - 

147 PR00048A 10.52
5.826e-10 105-119
PR00048A 10.52 5.320e- 
09 161-175 


702


BL00523 


Sulfatases proteins. 


BL00523E 19.27 2.565e-
25 326-356 BL00523A 
13.36 5.050e-16 38-55 
BL00523B 8.64 5.909e-
15 86-98 BL00523C 
12.64 5.500e-13 137-
148 BL00523D 9.89
1.844e-11 290-302
BL00523G 9.46 5.500e-
10 513-523 BL00523F
10.85 6.351e-09 413-
424


703 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 8.412e- 
12 376-390 PR00048B 
6.02 1.000e-10 334-344
PR00048B 6.02 1.474e-
09 364-374


707 


rUU U /B 7 


SYNTHASE BIOSYNTHESIS 
TRANSFERASE. 


PD007G7A 14.84 8.941e- ' 
14 66-82 


708 


PR00761 


BINDIN PRECURSOR
SIGNATURE 


PR00761E 14.32 8.500e-
10 822-841


712 


DM01354 


kw TRANSCRIPTASE REVERSE 
II ORF2.


DM01354Y 10.69 4.977e-
38 425-465 DM01354X
13.86 7.300e-34 376-
415 DM01354V 12.97
4.923e-17 311-358
DM01354W 12.64 5.596e-
10 356-376


713 


BL00039


DEAD-box subfamily ATP-
dependent helicases 
proteins . 


BL00039D 21.67 7.545e-
27 450-496 BL00039A
18.44 2.537e-18 147-
186 BL00039C 15.63
2.216e-14 280-304
BL00039B 19.19 1.947e-
13 194-220


715 


BL00383


Tyrosine specific 
protein phosphatases 
proteins . 


BL00383E 10.35 4.981e- 
10 150-161 


717 


PF00777 


Sialyl transferase 
family. 


PF00777C 18.60 4.035e- 
21 106-161


718 


DM00031 


IMMUNOGLOBULIN V REGION. 


DM00031A 16.80 3.750e- 
39 20-68 DM00031B
15.41 2.688e-28 84-118 
DM00031C 12.79 1.300e- 
12 131-142


719 


BL00243 


Integrins beta chain 
cysteine-rich domain 
proteins. 


BL00243B 17.54 1.000e-
40 131-172 BL00243C
16.42 1.000e-40 172-
208 BL00243D 24.07
1.000e-40 222-274
BL00243F 22.63 1.000e-
40 314-358 BL00243I
31.77 6.571e-39 607-
650 BL00243E 16.70
3.077e-35 274-304
BL00243G 21.38 3.625e-
34 358-400 BL00243H
17.53 5.235e-29 567-
593 BL00243A 17.61
3.250e-21 63-84
BL00243H 17.53 7.167e-
16 477-503 BL00243H
17.53 2.304e-11 524-

5.304e-11 606-632
BL00243I 31.77 1.380e-
09 610-653


720 


PR00217 


43 KD POSTSYNAPTIC 
PROTEIN SIGNATURE 


PR00217C 10.91 8.022e-
09 20-36


722 


PR00704 


CALPAIN CYSTEINE 
PROTEASE (C2) FAMILY 
SIGNATURE 


PR00704D 11.05 5.909e-
34 135-161 PR00704F
13.61 7.000e-26 190-
218 PR00704E 12.55
8.071e-26 165-189
PR00704B 17.94 2.241e- 
23 75-98 PR00704A 
14.68 4.094e-19 30-54 
PR00704C 11.88 1.871e- 
18 99-116 


725 


PR00194 


TROPOMYOSIN SIGNATURE 


PR00194A 7.86 7.652e- 
09 169-187


726 


PR00194 


TROPOMYOSIN SIGNATURE 


PR00194A 7.86 7.652e-
09 169-187


727 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320C 13.01 2.125e- 
13 277-292 PR00320A 
16.74 1.310e-ll 277- 
292 PR00320C 13.01
4.522e-11 323-338
PR00320A 16.74 6.586e-
11 323-338 PR00320B
12.19 4.343e-10 323-
338 PR00320B 12.19 
6.914e-10 277-292 


731 


PR00195


DYNAMIN SIGNATURE 


PR00195A 11.94 8.627e-
16 288-307 PR00195E 
9.82 3.912e-ll 457-474 


733 


PF00642 


Zinc finger C-x8-C-x5-C- 
x3-H type (and similar) . 


PF00642 11.59 9.082e- 
10 787-798 


738 


BL00039


DEAD-box subfamily ATP- 
dependent helicases
proteins . 


BL00039A 18.44 2.565e- 
28 26-65 BL00039D
21.67 2.105e-20 338-
384 BL00039C 15.63 
9.100e-13 160-184 
BL00039B 19.19 9.617e- 
11 73-99 


739 


BL01289


TSC-22 / dip / bun 
family proteins. 


BL01289A 12.18 8.909e- 
31 326-353 BL01289B 
10.45 9.571e-17 353- 
383


742 


BL01019 


ADP-ribosylation factors
family proteins.


BL01019A 13.20 7.076e- 
12 41-81 


743 


BL00965 


Phosphomannose isomerase
type I proteins . 


BL00965C 23.78 1.000e-
40 256-305 BL00965B 
17.77 1.600e-25 126-
153 BL00965A 10.57 
6.400e-19 94-113 


747 


BL00021


Kringle domain proteins.


BL00021D 24.56 4.563e-
25 231-273 BL00021B 

IT 11 C 1ACa "51 tz r\ to 

■ij . JJ ^ . J^se-^X 60-78 


748 


BL00612


Osteonectin domain 
proteins . 


BL00612B 11.35 2.034e- 
11 93-126 


749 


PR00450 


RECOVERIN FAMILY
SIGNATURE 


PR00450C 12.22 6.880e- 

JLIJ 103-J.D/ 


752 


BL00795


Involucrin proteins . 


BL00795C 17.06 6.000e- 
11 5fl4-45<5 nT.ori*7Q^r» 

JO^t O Lt\J U / J JL 

17.06 9.444e-ll 370- 
415 


754 


BL00051 


Ribosomal protein L39e
proteins. 


BL00051 20.92 1.935e-
16 4-50 


755 


DM01970 


0 kw ZK632.12 YDR313C
ENDOSOMAL III.


DM01970B 8.60 7.723e- 
09 171-184 


760 


BL01020 


SAR1 family proteins.


BL01020C 15.35 9.020e-
12 99-150 


762 


BL00046


Histone H2A proteins. 


BL00046 12.95 1.000e-
40 33-88 


763


PD02411 


PROTEIN TRANSCRIPTION 
REGULATION NUCLEAR. 


PD02411 21.89 9.137e- 
10 206-240 


764 


BL00027


'Homeobox' domain
proteins. 


BL00027 26.43 8.800e- 
29 417-460 


767 


BL01208 


VWFC domain proteins. 


BL01208B 15.83 6.063e- 
10 309-324 BL01208B 
15.83 8.031e-10 165-
180 BL01208B 15.83
4.162e-09 85-100


770 


BL00031 


Nuclear hormones 
receptors DNA-binding 
region proteins. 


BL00031A 19.55 9.571e- 
32 208-241 BL00031B
22.25 5.500e-27 242- 
274 


772 


PR00449


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 1.450e- 
18 4-26 PR00449E 
13.50 3.520e-14 142- 
165 PR00449C 17.27 
3.032e-13 44-67 
PR00449D 10.79 8.579e- 
13 107-121 PR00449B 


773 


BL00523


Sulfatases proteins. 


BL00523E 19.27 9.333e- 

13.36 2.200e-13 47-64 
BL0052^B S 64. ~> Pin to. 
13 91-103 BL00523D
9.89 7.923e-12
BL00523C 12.64 4.512e-
10 141-152 BL00523F 
10.85 5.821e-10 373- 
384 


775 


BL00028


Zinc finger, C2H2 type, 
domain proteins . 


BL00028 16.07 7.686e- 
09 568-585 


776 


BL00028 


Zinc finger, C2H2 type,
domain proteins.


BL00028 16.07 7.686e- 
09 621-638 


777 


BL00028 


Zinc finger, C2H2 type, 
domain proteins . 


BL00028 16.07 7.686e- 
09 595-612 


778 


BL00030


Eukaryotic RNA-binding
region RNP-1 proteins. 


BL00030A 14.39 8.412e- 
11 322-341 BL00030A 
1^.07 / « uuue-io 220- 
239 


779 


PR00079 


GLUCOSE-6-PHOSPHATE
DEHYDROGENASE SIGNATURE 


PR00079B 12.98 2.929e- 
26 193-222 PR00079E 
16.65 4.150e-23 348- 
375 PR00079C 8.68
6.351e-16 246-264 
rKuuu i -) lj ij . ai f , \j /Ue — 
16 264-281 PR00079A 
16 19 6 7GQf»-"i"* i ezQ 

X © . J. £, O. /D^C - J, J ±Oj7 — 

183 


781 


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 9.250e- 
17 10-35 BL00215A 
15.82 6.000e-16 221- 
246 BL00215A 15.82 
7.857e-12 108-133 
BL00215B 10.44 9.526e- 
11 168-181 


783 


PD00209 


PROTEIN SH3 DOMAIN 
REPEAT PRESYNA. 


PD00289 9.97 6.276e-09 
159-173 


785 


BL00690


DEAH-box subfamily ATP-
dependent helicases
proteins. 


BL00690B 13.38 1.000e-
12 147-165 BL00690A
6.87 5.320e-10 114-124 
BL00690C 7.51 3.189e- 
09 218-228 


786 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449C 17.27 8.500e- 
16 50-73 PR00449A 
13.20 5.235e-14 8-30 
PR00449E 13.50 2.853e- 
11 150-173 PR00449D 
10.79 1.545e-09 111- 
125 


788 


DM01206 


CORONAVIRUS NUCLEOCAPSID
PROTEIN. 


DM01206B 10.69 8.767e-
10 1-21


790


BL00915


Phosphatidylinositol 3-
and 4-kinases proteins.


BL00915C 22.43 9.182e-
39 725-764 BL00915B
22.78 5.050e-33 633- 
671 BL00915D 27.02 
1.529e-21 795-831 
BL00915A 10.09 1.000e-
13 395-407 


791 


PR00208 


GLIADIN AND LMW GLUTENIN
SUPERFAMILY SIGNATURE 


PR00208A 12.59 6.294e-
10 120-138 PR00208A 
12.59 6.294e-10 121- 
139 PR00208A 12.59 
6.294e-10 122-140 
PR00208A 12.59 6.294e- 
10 123-141 PR00208A
12.59 6.294e-10 124- 
142 PR00208A 12.59 
6.294e-10 125-143 
PR00208A 12.59 6.294e- 
10 126-144 PR00208A 
12.59 6.294e-10 127- 
145 PR00208A 12.59 
6.294e-10 128-146 
PR00208A 12.59 6.294e- 
10 129-147 PR00208A 
12.59 7.411e-09 130- 
148 PR00208A 12.59 
7.658e-09 131-149 
PR00208A 12.59 7.904e- 
09 132-150 PR00208A 
12.59 8.274e-09 118- 
136 PR00208A 12.59 
8.274e-09 119-137 


795 


PR00205 


CADHERIN SIGNATURE 


PR00205B 11.39 5.034e- 
16 302-320 PR00205A 
14.73 1.257e-11 284-
300 PR00205C 13.65
1.333e-11 337-352


796 


BL00412 


Neuromodulin (GAP-43)
proteins.


BL00412D 16.54 4.000e-
12 196-247 BL00412D
16.54 5.705e-11 197-
248 BL00412D 16.54
7.848e-10 199-250
BL00412D 16.54 1.827e-
09 195-246 BL00412D
16.54 1.918e-09 194-

2.102e-09 201-252


797 


BL00021 


Kringle domain proteins . 


Biinoo^is i k i-jq 0 

DUUUU41D XJ • JJ b . £ j zfG 
13 40-58 


799 


BL01052


proteins. 


BLQIO^C? 1 fl — t — nnn a 

40 87-127 BL01052A 
16.12 1.529e-32
BL01052B 15.31 1.257e- 
25 52-78 BL01052D 
10.26 5.737e-25 174- 
194 


800 


BL00348


p53 tumor antigen
proteins.


BL00348F 23.19 3.714e-
09 197-240 


801 


BL00309 


Vertebrate galactoside- 
binding lectin proteins. 


BL00309C 18.65 1.621e-
09 62-87 


802 


PR00245 


OLFACTORY RECEPTOR 
SIGNATURE 


PR00245D 10.47 5.224e- 
09 187-199 


804 


PF00774 


Dihydropyridine 
sensitive L-type calcium 
channel (Beta subuni.


PF00774A 16.47 8.457e- 
10 110-156 


808 


PR00667


RETINAL PIGMENT 
EPITHELIUM-RETINAL GPCR
SIGNATURE 


PR00667C 11.71 9.875e- 
09 12-28 


810


PD02346


PHOTOSYSTEM II PROTEIN
PRECURSOR
PHOTOSYNTHESIS.


PD02346F 12.89 4.340e-
09 317-354




811


BL00685


CBF-A/NF-YB subunit 
proteins . 


BL00685B 14.41 6.779e-
14 54-95 BL00685A
11.22 4.798e-13 5-54


812 


PR00080


ALCOHOL DEHYDROGENASE
SUPERFAMILY SIGNATURE


PR00080A 9.32 9.419e- 
10 93-105 


813 


BL00357 


Histone H2B proteins.


BL00357 7.74 1.988e-17
22-65 


815 


PD00066 


PROTEIN ZINC-FINGER 
METAL-BINDI. 


PD00066 13.92 7.923e- 
15 158-171 PD00066 
13.92 5.200e-14 46-59 
PD00066 13.92 7.000e- 
14 18-31 PD00066 
13.92 7.000e-13 130- 
143 PD00066 13.92 
7.500e-13 214-227 
PD00066 13.92 9.000e- 
13 102-115 PD00066
13.92 4.429e-12 186-
199 PD00066 13.92
1.783e-11 74-87


816 


BL01195 


Peptidyl-tRNA hydrolase 
proteins. 


BL01195C 20.12 3.348e- 
20 100-139 


820 


BL00520


Interleukin-10 family 
proteins. 


BL00520A 6.21 6.471e-
09 1-14 


822 


BL00972 


Ubiquitin carboxyl-
terminal hydrolases
family 2 proteins. 


BL00972A 11.93 8.113e- 
09 224-242 


825 


PR00876 


NEMATODE METALLOTHIONEIN
SIGNATURE 


PR00876B 7.66 2.268e- 
10 101-115 


829 


PD02855 


FLAVOPROTEIN PROTEIN 
DNA/PANTOTHEN.


PD02855A 18.37 4.732e-
28 88-124 PD02855B 
8.36 6.478e-09 132-142 


830 


PR00405


HIV REV INTERACTING 
PROTEIN SIGNATURE 


PR00405B 11.83 7.000e- 
21 44-62 PR00405C 
19.41 1.000e-13 65-87 
PR00405A 17.71 7.283e- 
13 25-45 


831 


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE 


PR00019A 11.19 1.000e-
09 47-61 PR00019B
11.36 1.720e-09 136-
150 PR00019B 11.36
3.880e-09 44-58


832 


PR00011 


TYPE III EGF-LIKE
SIGNATURE 


PR00011B 13.08 3.438e-
16 164-183 PR00011D
14.03 6.850e-16 164-
183 PR00011A 14.06
8.364e-14 164-183
PR00011C 24.25 5.415e-
12 231-260 PR00011D
14.03 9.852e-11 212-
231


834 


PD00306 


PROTEIN GLYCOPROTEIN 
PRECURSOR RE. 


PD00306A 10.26 7.000e- 
12 232-246 


835 


PD00306 


PROTEIN GLYCOPROTEIN 
PRECURSOR RE. 


PD00306A 10.26 4.000e- 
10 290-304 


836 


PD00306 


PROTEIN GLYCOPROTEIN 


PD00306A 10.26 7.000e-
Xd. 216-230 


837 


DM00215 


PROLINE-RICH PROTEIN 3.


DM00215 19.43 3.898e- 
09 78-111 


839 


PD02784 


PROTEIN NUCLEAR
RIBONUCLEOPROTEIN.


PD02784B 26.46 8.302e- 
09 73-116 


840


PR00700


PROTEIN TYROSINE 
PHOSPHATASE SIGNATURE 


PR00700B 16.80 5.091e-
22 369-390 PR00700D
12.47 5.765e-21 491-
510 PR00700C 13.17
4.750e-14 449-467
PR00700F 11.18 8.500e-
11 538-549 PR00700E
17.57 3.100e-10 522-
538


841 


PR00109


TYROSINE KINASE
CATALYTIC DOMAIN
SIGNATURE


PR00109B 12.27 5.404e-
13 134-153 


844 




PROTEIN RIBOSOMAL 60S
L22 RNA-BINDING HEP. 


PD02785B 14.43 1.000e-
40 58-112 PD02785A 
lb . £ J l.yibe-^8 8-57 


845


BL00826


MARCKS family proteins. 


BL00826C 7.63 6.738e- 
09 203-230 


846 


BL00518


Zinc finger, C3HC4 type 
(RING finger), proteins.


BL00518 12.23 4.429e- 
10 15-24 


849 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins. 


BL00518 12.23 1.000e-
08 340-349 


850 


PR00308 


TYPE I ANTIFREEZE 
PROTEIN SIGNATURE 


PR00308A 5.90 6.506e-
09 12-27 


851 


PD02411 


PROTEIN TRANSCRIPTION 
REGULATION NUCLEAR. 


PD02411 21.89 7.000e- 
16 246-280 


852 


BL00420 


Speract receptor repeat
proteins domain
proteins.


BL00420B 22.67 1.000e-
40 723-778 BL00420B
22.67 1.321e-38 933-
988 BL00420B 22.67
8.457e-28 482-537
BL00420B 22.67 4.500e-
27 587-642 BL00420B
22.67 9.625e-27 270-
325 BL00420B 22.67
4.205e-26 163-218
BL00420B 22.67 5.731e-
23 55-110 BL00420B
22.67 6.464e-20 377-
432 BL00420B 22.67
2.800e-15 830-885
BL00420C 11.90 1.900e-
13 355-366 BL00420C
11.90 1.900e-12 808-
819 BL00420C 11.90
3.550e-12 248-259
BL00420C 11.90 2.831e-
11 141-152 BL00420C
11.90 5.119e-11 1018-
1029 BL00420C 11.90
7.955e-10 567-578


853 


BL00420 


Speract receptor repeat 
proteins domain 
proteins . 


BL00420B 22.67 1.000e-
40 756-811 BL00420B
22.67 1.321e-38 966-
1021 BL00420B 22.67
8.457e-28 482-537
BL00420B 22.67 4.500e-
27 620-675 BL00420B
22.67 9.625e-27 270-
325 BL00420B 22.67
4.205e-26 163-218
BL00420B 22.67 5.731e-
23 55-110 BL00420B
22.67 6.464e-20 377-
432 BL00420B 22.67
2.800e-15 863-918
BL00420C 11.90 1.900e-
13 355-366 BL00420C
11.90 1.900e-12 841-
852 BL00420C 11.90
3.550e-12 248-259
BL00420C 11.90 2.831e-
11 141-152 BL00420C
11.90 5.119e-11 1051-
1062 BL00420C 11.90
7.955e-10 567-578


857 


PR00388 


3',5'-CYCLIC NUCLEOTIDE
CLASS II
PHOSPHODIESTERASE
SIGNATURE


PR00388A 10.45 2.778e- 
09 64-83 


859


BL00030 


Eukaryotic RNA-binding
region RNP-1 proteins. 


BL00030A 14.39 2.929e-
13 37-56 BL00030B
7.03 1.900e-11 167-177
BL00030A 14.39 2.000e- 
XQ 128-147 


861 


PR00988


URIDINE KINASE SIGNATURE 


PR00988A 6.39 4.250e-
17 23-41 PR00988C
13.64 8.714e-16 107-
123 PR00988F 12.23
7.828e-15 198-212
PR00988E 8.27 9.769e-
12 176-188 PR00988D
S.95 8.250e-11 163-174
PR00988B 11.60 4.512e-
10 60-72


863


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215B 10.44 8.071e- 
12 41-54 


OCX 


PR00775 


90 KD HEAT SHOCK PROTEIN 
SIGNATURE 


PR00775E 8.06 1.000e-
24 198-221 PR00775B
3.52 1.837e-23 107-130
PR00775D 8.91 4.484e-
17 171-189 PR00775A
9.90 8.342e-17 86-107 
PR00775C 10.68 9.379e- 
17 153-171 PR00775G 
10.64 6.850e-15 267- 
286 PR00775F 12.76 
D.769e-14 249-267 


866 


DM01688 


2 POLY-IG RECEPTOR.


DM01688G 16.45 9.460e- 
09 89-121


867 


PD01066 


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 5.596e- 
29 14-53 


868 


BL01287 


RNA 3'-terminal
phosphate cyclase
proteins.


BL01287A 17.95 2.688e-
ad lb —Ho 


869 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 6.464e- 


872 


BL00046


Histone H2A proteins. 


BL00046 12.95 1.000e-
40 30-85 


874 


BL00188 


Biotin-requiring enzymes
attachment site
proteins.


RT.nmnR ^ n *>q q m£c» 
32 665-711 


876 


BL00028


domain proteins. 


DUUUUZO JL O . U / >.D006- 

09 298-315 


877 


PD02102 


SUBUNIT E V-ATPASE 

VACUOLAR ATP SYNTHASE 
HYDROL.


PD021O2A 1^ 7i 4 T-fZZZ 

10 97-141 


879 


BL01189 


Ribosomal protein S12e 
proteins. 


BL01189A 14.27 1.000e-
40 35-71 BL01189B
13.49 1.000e-40 71-125


882 


BL00284




BL00284C 28.56 6.400e-
25 62-104 BL00284B 
17.99 6.182e-12 35-56 


889 


BL00216 


Sugar transport 
proteins. 


BL00216B 27.64 4.375e- 
21 35-85 


896 


PR00391 


PHOSPHATIDYLINOSITOL
TRANSFER PROTEIN 
SIGNATURE 


PR00391E 12.50 7.785e- 
15 211-231 PR00391B 
8.39 1.000e-13 83-104 
PR00391D 12.21 9.328e- 
13 191-207 PR00391A 
7.83 5.390e-11 16-36


897 


PR00327 


ICE NUCLEATION PROTEIN
SIGNATURE


PR00327C 6.37 5.247e-
09 313-328


898 


BL00039


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 7.800e- 
26 386-432 BL00039A 
18.44 6.674e-16 113- 
152 BL00039B 19.19 
1.947e-13 153-179 
BL00039C 15.63 9.460e-
11 236-260


901 


PD00066 


PROTEIN ZINC-FINGER


PD00066 13.92 8.200e-
16 254-267 PD00066
13.92 8.200e-16 282-
295 PD00066 13.92
8.200e-16 310-323
PD00066 13.92 8.200e-
16 366-379 PD00066 
13.92 8.200e-16 394- 
407 PD00066 13.92 
ti.20Oe-14 338-351 


902 


BL01115 


GTP-binding nuclear
protein ran proteins. 


BL01115A 10.22 9.321e-
11 6-50 


903 


PR00806 


VINCULIN SIGNATURE 


PR00806B 4.28 9.160e-
09 97-111 


904 


PR00381


KINESIN LIGHT CHAIN 
SIGNATURE 


PR00381E 8.75 6.S86e- 
25 335-356 PR00381B 
18.17 2.667e-24 204- 
224 PR00381A 9.S5 
2.800e-24 107-125 
PR00381C 12.48 4.522e- 
24 226-245 PR00381D 
13.94 1.084e-22 291- 
309 PR00381F 9.13 
3.288e-22 370-392 
PR00381F 9.13 7.181e-
13 286-308 PR00381E
8.75 4.066e-11 251-272
PR00381E 8.75 7.033e-
11 293-314 PR00381E
8.75 8.364e-10 377-398
PR00381D 13.94 5.230e-
09 333-351 PR00381C
12.48 7.120e-09 310-
329 


906 


PR00345 


SIGNATURE 


PR00345C 4.54 8.557e-
09 525-549 


907 


PR00345 


SIGNATURE 


PR00345C 4.54 8.557e-
09 513-537 


908 


BL00678


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 9.308e-11
144-155 


910 


PD01066 


ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 2.800e-
30 48-87 


912 


BL01104 


Ribosomal protein L13e 
proteins . 


BL01104C 15.14 6.000e- 
09 364-392


922 


BL00678


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 3.842e-09
500-511 


923 


PR00320 


G-PROTEIN BETA WD-40


PR00320C 13.01 2.500e- 
09 323-338 PR00320C 
13.01 5.500e-09 187-
202 


924 


PD02181 


PROTOCHLOROPHYLLIDE
REDUCTASE PHOTOSYNT.


PD02181D 12.85 8.609e- 
09 36-64 


926 


BL00019


Actinin-type actin-
binding domain proteins.


BL00019C 14. 6£ 7.4$3e- 
25 108-144 BL00019B 
13.34 6.510e-11 61-84
BL00019D 15.33 9.338e- 
11 205-235 BL00019A 
12.56 2.373e-10 34-45 


BL00678


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 9.308e-11
273-284 BL00678 9.67
1.600e-10 314-325

360-371 BL00678 9.67
8.579e-09 206-217


929 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins. 


BL00518 12.23 1.857e-
10 137-146 


930 


BL01085 


Ribulose-phosphate 3- 
epimerase family 
proteins . 


BL01085D 16.55 4.600e- 
24 134-165 BL01085B 
10.15 5.680e-22 30-52 

BL01085E 18.87 8.676e-
20 172-202 BL01085C 

Ol PT O m O a 1 A CC Q"7 


931 


BL01085 


Ribulose-phosphate 3-
epimerase family
proteins.


BL01085D 16.55 4.600e-
24 Ib^-1B3 BL01085B
10.15 5.680e-22 30-52 
BL01085E 18.87 8.676e- 

Oft 1 O ft _ O O n DTmrtOCf 

21.81 2.038e-14 66-97 


933 


PD00301 


PROTEIN REPEAT MUSCLE
CALCIUM-BI.


nnnn t m a i r\ ^ a c a nn » 
fc'UUUJVIA 1U.^4 o . 4U0e- 

09 160-171 


936 


PF00168


C2 domain proteins. 


PF00168C 27.49 4.000e-
Xc JJo-J 6 2 


937


BL00415 


Synapsins proteins. 


BL00415N 4.29 9.519e-
10 5-49


940


PR00862 


PROLYL OLIGOPEPTIDASE 
SIGNATURE 


PR00862D 16.17 4.086e- 
09 63-84 


945


BL01230 


RNA methyltransferase
trmA family proteins.


BL01230B 11.62 2.373e- 
09 407-420 


948 


BL00479 


Phorbol esters /
diacylglycerol binding
domain proteins.


BL00479B 12.57 7.429e-
18 52-68 BL00479A 
19.86 2.200e-13 26-49 


949


BL00678


proteins proteins. 


BL00678 9.67 1.474e-09 
100-111 


954


PD01311


NAD INTERGENIC RE.


PD01311A 30.23 5.909e- 
10 66-111 


955 


PF00651 


BTB (also known as BR-
C/Ttk) domain proteins.


PF00651 15.00 3.250e-
12 47-60 


956 


PF00651 


BTB (also known as BR-
C/Ttk) domain proteins.


PF00651 15.00 3.250e- 
12 47-60 


957 


BL00379 


CDP-alcohol
phosphatidyltransferases
proteins.


BL00379 24.^4 l.SiOe- 


959 


BL01115 


GTP-binding nuclear 
protein ran proteins. 


10 31-75 


960 


BL01115 


GTP-binding nuclear 
protein ran proteins. 


BL01115A 10.22 3.438e-
14 110-154 


962 


BL00061 


Short-chain
dehydrogenases/reductases
family proteins.


BL00061B 25.79 6.586e- 
13 198-236


963 


PR00502 


MUTT DOMAIN SIGNATURE 


PR00502A 15.06 8.200e- 
ii o i n oo c 


966


PR00308


TYPE I ANTIFREEZE 
PROTEIN SIGNATURE 


PR00308A 5.90 7.035e- 
09 55-70 


967 


DM01206 


CORONAVIRUS NUCLEOCAPSID 
PROTEIN. 


DM01206B 10.69 1.286e-
12 104-124 DM01206B
10.69 5.299e-11 23-43
DM01206B 10.69 8.274e-
10 73-93 DM01206B
10.69 3.962e-09 108-
128 DM01206B 10.69
5.671e-09 38-58


969 


PF01008 


Initiation factor 2 
subunit . 


PF01008B 25.59 4.724e-
31 417-460 PF01008C
12.25 5.333e-18 506-
526 PF01008A 20.14
5.875e-15 369-390


970 


BL01277 


Ribonuclease PH 
proteins . 


BL01277C 10.18 7.648e- 
10 112-143 BL01277A 
17.39 9.806e-10 40-78 


975 


BL01159 


WW/rsp5/WWP domain 


BL01159 13.85 3.605e-
12 130-145 BL01159
13.85 4.122e-10 171-
186


977 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin 
receptors . 


PF00791C 20.98 2.235e- 
09 55-94 


978


BL01167


Ribosomal protein L17 
proteins . 


BL01167B 20.66 8.258e- 
19 88-127 


979 


BL00478 


LIM domain proteins. 


BL00478B 14.79 9.357e- 
13 33-48 BL00478B 
14.79 7.2S0e-12 98-113 


980 


PR00312 


CALSEQUESTRIN SIGNATURE 


PR00312E 8.32 3.423e- 
36 169-199 PR00312I 
15.78 5.28Se-35 332- 
361 PR00312F 15.06 
5.865e-35 199-229 
PR00312H 13.31 8.313e- 
35 263-291 PR00312J 
13.73 5.688e-34 363- 
392 PR00312D 9.43 
2.636e-33 128-158 
PR00312C 15.14 8.839e- 
33 92-122 PR00312B 
15.08 8.941e-33 62-92 
PR00312G 11.11 6.657e- 
32 230-258 PR00312A 
11.70 6.914e-27 35-59 


981 


PF00992 


Troponin.


PF00992A 16.67 8.816e- 
09 414-449 


982 


PR00299


ALPHA CRYSTALLIN 
SIGNATURE 


PR00299F 13.20 2.367e- 
09 127-149 


983 


BL01150 


Respiratory-chain NADH
dehydrogenase 20 Kd 
subunit proteins. 


BL01150B 17.16 1.000e-
40 156-202 BL01150A 
14.10 8.200e-39 100- 
138 


QflC 

7<JO 


BL00795 


Involucrin proteins. 


BL00795C 17.06 7.211e- 
14 4-49 BL00795C 
17.06 1.778e-ll 1-46 
BL00795C 17.06 3.407e- 
10 14-59 BL00795C 
17.06 7.802e-10 2-47 
BL00795C 17.06 8.640e- 
10 19-64 BL00795C 
17.06 7.400e-09 11-56
BL00795C 17.06 7.800e-
09 3-48 


987 


BL00939


Ribosomal protein L1e
proteins . 


BL00939F 17.27 5.393e- 
09 810-840 


988 


PR00452 


SH3 DOMAIN SIGNATURE 


PR00452B 11.65 6.538e-
11 525-541 


989 


PR00452 


SH3 DOMAIN SIGNATURE 


PR00452B 11.65 6.538e- 
11 497-513 


994 


BL00027 


'Homeobox' domain
proteins. 


BL00027 26.43 2.500e- 
25 146-189 


997 


BL01304


ubiH/COQ6 monooxygenase
family proteins. 


BL01304A 8.05 3.893e- 
11 65-79 


998 


DM01767 


5 TRANSMITTER DOMAIN. 


DM01767B 10.07 7.868e- 
09 22-39 


1000 


PR00926 


MITOCHONDRIAL CARRIER 
PROTEIN SIGNATURE 


PR00926C 16.07 1.7S0e-
24 73-94 PR00926D
10.53 3.250e-23 126-
145 PR00926F 17.75
6.211e-23 217-240
PR00926E 11.70 6.625e-
20 174-193 PR00926B
16.07 2.125e-18 24-39
PR00926A 10.41 1.000e-
15 11-25 PR00926F
17.75 5.565e-09 120-
143


1005 


BL00406 


Actins proteins. 


BL00406B 5.47 1.000e-
40 88-143 BL00406C
6.75 1.000e-40 147-202
BL00406D 12.58 3.700e-
40 270-325 BL00406E 
8.44 7.375e-38 327-377 
BL00406A 9.95 3.348e- 
29 11-46 


1006 


BL00406


Actins proteins. 


BL00406B 5.47 1.000e-
40 88-143 BL00406C 
6.75 1.000e-40 147-202 

ATififldl OfxV ft &.A t nnria 

OJJUUIUOCi O . 'iH i.UUUS — 

35 248-298 BL00406A 
9.95 3.348e-29 11-46 


1007 


PR00304


TAILLESS COMPLEX
POLYPEPTIDE 1
(CHAPERONE) SIGNATURE


22 384-407 PR00304C 
8.69 4.667e-20 98-118 

rKUUJU^D XX. bU /,D77e'" 

19 68-87 PR00304A 

•s * £*\J JaJO«6~ 1Q *H)"oj 

PR00304E 7.79 6.870e- 
13 418-431 


1009 


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL- 
BINDING NU. 


PD01066 19.43 2.929e- 
32 9-48 


1011 


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL- 
BINDING NU. 


r ±s \s ± \j \j « 

32 68-107 


1012 


BL00518


Zinc finger, C3HC4 type
(RING finger), proteins.


OUUU JXU -L £ . C. J U ill Jc 

10 64-73 


1016 


PD01168


SYNTHETASE LIGASE 
PROTEIN ALANYL. 


PDOHS8H OR i nnna 
11 174-194 


1016 


PD00930 


PROTEIN GTPASE DOMAIN 
ACTIVATION . 


PD00930B 33.72 1.391e- 
32 261-302 PD00930A
25 62 9 550e-25 1 c*7 _ 
183 


1022 


BL00175 


Phosphoglycerate mutase 
family phosphohistidine 
proteins. 


BL00175A 15.42 5.179e-
12 6-26 BL00175C
23.75 8.062e-10 79-111


1025 


PR00305


14-3-3 PROTEIN ZETA 
SIGNATURE 


PR00305D 16.34 1.439e- 
10 158-185 


1026 


BL00353 


HMG1/2 proteins. 


BL00353B 11.47 2.436e-
18 238-288 BL00353C
14.83 8.844e-11 288-
335 


1028 


BL00183 


Ubiquitin-conjugating
enzymes proteins. 


BL00183 28.97 1.310e-
33 43-91 


1033 


PF00580


UvrD/REP helicase.


PF00580A 13.37 4.720e- 
09 111-133 


1034 


PR00413 


HALOACID
DEHALOGENASE/EPOXIDE
HYDROLASE FAMILY
SIGNATURE


PR00413E 15.78 3.429e- 
09 154-171 


1037 


PD01066


PROTEIN ZINC FINGER 
ZINC-FINGER METAL- 
BINDING NU. 


PD01066 19.43 9.657e-
09 5-44 


1038 


PD01796 


PROTEIN TRANSMEMBRANE
COBALT ZINC CADMIU. 


PD01796 15.01 4.259e- 
11 55-82 


1039


BL00299 


Ubiquitin domain 
proteins . 


BL00299 28.84 9.036e- 
09 17-69 


1040 


PR00970 


ARGININE ADP-
RIBOSYLTRANSFERASE
SIGNATURE


PR00970A 17.73 6.143e-
20 56-78 PR00970D
9.96 2.154e-18 154-171
PR00970F 12.30 1.000e-
16 224-241 PR00970G
9.97 9.229e-15 242-258
PR00970B 16.37 1.290e-
13 86-105 PR00970C
11.05 1.643e-11 115-
130 PR00970E 11.23

o<iUe-li ^Uz-Zlo 


1042 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins . 


BL00678 9.67 2.200e-10 
243-254 


1043 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 6.786e- 
13 114-128 PR00048A 
10.52 1.000e-09 172- 
186 




BL00615


C-type lectin domain
proteins. 


BL00615A 16.68 1.720e- 
11 218-236 BL00615B 
12.25 1.857e-10 317- 
331 


1046 




Adenylate cyclases 
class-I proteins.


BL01092N 13.54 8.924e- 
10 3-40


1047 


BL01216 


ATP-citrate lyase / 
succinyl-CoA ligases 
family proteins. 


BL01216D 21.75 4.316e- 
28 314-344 BL01216A 
13.91 1.000e-10 97-112 


1049 


DM00031 


IMMUNOGLOBULIN V REGION. 


DM00031B 15.41 7.618e-
12 102-136 


1050 


BL01073 


Ribosomal protein L24e 
proteins . 


BL01073 24.30 1.000e-
40 12-62 


1054 


BL00571


Amidases proteins. 


BL00571 25.69 5.875e-
31 160-212 


1055


BL00030 


Eukaryotic RNA-binding 
region RNP-1 proteins. 


BL00030A 14.39 5.235e-
11 98-117 BL00030B
7.03 4.316e-09 137-147 


1058 


BL00223 


Annexins repeat proteins 
domain proteins . 


BL00223C 24.79 8.754e- 
23 262-317 BL00223A 
15.59 9.478e-14 46-80
BL00223A 15.59 5.557e- 
11 118-152 


1060 


BL00027


'Homeobox' domain
proteins. 


BL00027 26.43 3.455e- 
35 158-201 


1064


BL00455 


Putative AMP-binding 
domain proteins . 


BL00455 13.31 6.211e- 
13 280-296 


1065 


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE 


PR00019A 11.19 2.000e- 
09 115-129 PR00019B 
11.36 3.880e-09 87-101 


1066


PR00326


GTP1/OBG GTP-BINDING
PROTEIN FAMILY SIGNATURE 


PR00326A 8.75 4.600e- 
16 151-172 PR00326C 
9.79 1.290e-14 200-216 
PR00326B 16.74 8.548e- 
14 172-191 PR00326D 
13. 09 X.257e-13 217- 
236 


1071 


PD02870 


RECEPTOR INTERLEUKIN-1 


PD02870B 18.83 8.518e- 

11 1 £ A 1 0*7 


1072 


PF00856


SET domain proteins. 


PF00856A 26.14 5.976e-
09 350-387 


1075 


BL01009


Extracellular proteins
SCP/Tpx-1/Ag5/PR-1/Sc7
proteins.


BL01009D 14.19 4.300e- 
■jn tot 1 a a dt fti nnon 

13.75 6.586e-13 57-75 
BL01009E 13.50 1.439e- 
11 159-175 


1077 


PR00724 


CARBOXYPEPTIDASE C 
SERINE PROTEASE (S10) 
FAMILY SIGNATURE 


PR00724A 10.91 1.000e-
08 366-379 


1078 


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 1.000e-
12 170-195 BL00215A 
15.82 7.529e-10 79-104 


1079 


BL00678 


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 4.316e-09
298-309


1081


BL00326


Tropomyosins proteins.


BL00326A 14.01 7.398e-
10 23-57


1094


BL00460


Glutathione peroxidases
selenocysteine proteins.


BL00460A 28.67 3.204e-
18 57-92 BL00460B
9.73 6.400e-13 100-118
BL00460D 16.89 9.143e-
12 162-182 BL00460C
14.35 5.500e-09 133-
156


1095


PD02811


PROTEIN PEPTIDE
REDUCTASE MG448 PILB
FIMBRIA TRAN.


PD02811A 20.67 3.017e-
22 67-105 PD02811B
17.07 2.263e-21 118-
151 PD02811C 13.25
5.696e-13 154-167


1096


PD02811


PROTEIN PEPTIDE
REDUCTASE MG448 PILB
FIMBRIA TRAN.


PD02811A 20.67 3.017e-
22 60-98 PD02811B
17.07 2.263e-21 111-
144 PD02811C 13.25
5.696e-13 147-160


1097


BL00479


Phorbol esters /
diacylglycerol binding
domain proteins.


BL00479B 12.57 6.143e-
09 200-216


1105


PF00881


Nitroreductase family.


PF00881A 27.15 9.229e-
13 111-147



1109 



PR00449 



1115 



PR00405 



BL00355 



1117 



1123 



BL00355 



BL00107

PR00412



TRANSFORMING PROTEIN P21 
RAS SIGNATURE 



HIV REV INTERACTING 
PROTEIN SIGNATURE 



PR00449A 13.20 3.077e- 
10 15-37 PR00449E 
13.50 1.857e-09 185- 
208 PR00449D 10.79 
8.364e-09 131-145 



HMG14 and HMG17 
proteins . 



PR00405B 11.83 5.737e-
20 42-60 PR00405A 
17.71 2.703e-17 23-43 
PR00405C 19.41 6.902e- 
10 63-85 



BL00355 5.97 2.528e-25 
20-51 



HMG14 and HMG17 
proteins . 



BL00355 5.97 2.528e-25 
20-51 



Protein kinases ATP- 
binding region proteins. 



EPOXIDE HYDROLASE 
SIGNATURE 



BL00107B 13.31 4.857e- 
10 290-306 



PR00412F 18.76 9.526e- 
12 301-324 



1125


PR00186


HEMERYTHRIN SIGNATURE


PR00186A 13.62 2.800e-
09 87-101


1129


BL00170


Cyclophilin-type
peptidyl-prolyl cis-
trans isomerase
signatur.


BL00170C 18.49 3.077e-
33 84-129 BL00170B
20.97 6.838e-25 37-77
BL00170A 17.08 3.455e-
15 10-37



1131 



BL00636 



Nt-dnaJ domain proteins. 



BL00636A 8.07 5.304e- 
15 29-46 BL00636B 
15.11 1.360e-14 59-80 



1132


BL00678


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 6.211e-09
29-40


1133


BL00678


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 6.211e-09
29-40


1136


BL00990


Clathrin adaptor
complexes medium chain
proteins.


BL00990C 18.78 4.176e-
38 235-269 BL00990A
21.44 4.316e-36 94-132
BL00990B 20.15 2.125e-
27 157-187 BL00990D
16.13 5.320e-18 403-
422


1137


PR00314


CLATHRIN COAT ASSEMBLY
PROTEIN SIGNATURE


PR00314B 15.68 S.OOOe-
34 100-128 PR00314D
9.66 3.S31e-33 233-261
PR00314C 16.05 8.909e-
32 159-188 PR00314A
14.53 1.281e-22 13-34


1139 


BL01115 


GTP-binding nuclear 
protein ran proteins. 


BL01115A 10.22 6.364e- 
13 13-57 


1141 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 4.000e- 
19 451-482 BL00107B
13.31 3.077e-12 519-
535 


1148 


PR00685 


TRANSCRIPTION INITIATION 
FACTOR IIB SIGNATURE 


PR00685A 13.62 4.676e- 
09 21-42 


1155 


PD01652 


RECEPTOR CELL NK 
GLYCOPROTEIN IMMUNOGLOB . 


10 522-574 PD01652B 
8.50 9 4G3e-10 74fi-7Q9 


1157 


PD02894 


HYDROLASE N4- PRECURSOR 
PROTEIN SIGNAL BE. 


PD02894A 21.96 7.873e- 

13.93 1.188e-27 178-
211 


1159 


BL00623 


GMC oxidoreductases 
proteins . 


BL00623E 15.00 3.531e- 

Oft - did TIT AftC^IP 

10.86 4.240e-20 155- 
176 


1161 


PD01937 


DNA PROTEIN POLYMERASE 
ENDONUCLEASE DNA- . 


PD01937A 6.68 3.475e- 

U7 J J U J H x 


1162 


PD01937 


DNA PROTEIN POLYMERASE
ENDONUCLEASE DNA- . 


PD01937A 6.68 3.475e-
09 221-232 


1163 


PR00624 


HISTONE H5 SIGNATURE


PR00624D 11.94 7.455e- 
10 214-239 PR00624D 
11.94 1.961e-09 312- 
337 


1167 


BL00226 


Intermediate filaments 
proteins . 


BL00226B 23.86 7.384e- 


1177 


BL01032 


Protein phosphatase 2C 
proteins 


BL01032G 8.33 1.422e- 
10 34-48


1178 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE


PR00320A 16.74 1.794e-
lu PR00320C 
13.01 7.840e-10 205- 
220 PR00320B 12.19 . 

PR00320A 16.74 7.146e- 
09 35-50 PR00320B 
12.19 9.100e-09 79-94


1180 


PR00454 


ETS DOMAIN SIGNATURE


PR00454D 10.89 4 . lSOe- 
19 765-784 


1181 


BL00291 


Prion protein. 


BL00291A 4.49 8.962e- 
11 152-187 


1184 


BL00720 


Guanine-nucleotide
dissociation stimulators 
CDC25 family sign.


BL00720B 16.57 4.103e- 
18 1089-1113 


1185 


BL00215


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 4.553e- 
13 204-229 BL00215A 
15.82 1.429e-12 11-36 
BL00215A IS f\y 9 ft fl Qp - 
11 104-129 


1187 


BL00983 


Ly-6 / u-PAR domain 
proteins . 


BL00983C 12.69 2.761e-
10 77-93 


1188 


BL00878


Orn/DAP/Arg

decarboxylases family 2 
pyridoxal-P attachment 
si. 


oiiUup / oo iu . j 3 o . UUUe — 
16 189-204 BL00878C 
17.74 8.435e-15 225- 
245 BL00878F 19.67 
3.625e-13 379-402 
BL00878D 16.56 1.621e-
09 270-289 


1191 


PD02939


PROTEIN GLUTATHIONE 
SYNTHETASE SY. 


PD02939B 10.10 2.723e- 
12 203-220 PD02939C 
20.01 1.000e-11 224-
252 


1193 


PR00345


STATHMIN FAMILY 
SIGNATURE 


PR00345B 7.12 2.800e-
28 72-101 PR00345E
8.54 7.652e-28 149-174
PR00345C 4.54 9.100e-
28 101-125 PR00345D
10.97 1.964e-24 125-
149 PR00345A 13.46
5.645e-16 43-62


1194 


PR00345 


STATHMIN FAMILY
SIGNATURE 


PR00345B 7.12 2.800e-

8.54 7.652e-28 185-210
PR00345C 4.54 9.100e-
28 137-161 PR00345D
10.97 1.964e-24 161-
185 PR00345A 13.46
5.645e-16 79-98


1195 


PF00995 


Sec1 family


rfvU??3D i. / . J / 1 , 120e- 

13 224-264 


1196 


BL00982


Bacterial-type phytoene
dehydrogenase proteins . 


BL00982A 18.41 6.738e-
11 15-47 


1197 


BL01298 


Dihydrodipicolinate
reductase proteins. 


BL01298A 13.90 5.959e-
09 51-73 


1203 


BL00061


Short-chain
dehydrogenases/reductases
family proteins.


BL00061B 25.79 1.000e-
14 152-190 


1204 


PR00118


BETA-LACTAMASE CLASS S


PR00118F 16.42 9.386e- 
09 213-229 


1206 


BL01183


ubiE/COQ5
methyltransferase family
proteins.


BL01183B 21.31 1.429e- 
37 184-229 BL01183D 
27.71 8.535e-27 264- 
307 BL01183A 13.25 
3.250e-23 51-73 
BL01183C 10.77 5.295e- 
09 246-258 


1208 


BL00979 


G-protein coupled 
receptors family 3 
proteins. 


BL00979L 20.63 2.485e- 
09 105-146 


1209 


PF00023


Ank repeat proteins. 


PF00023A 16.03 4.857e-
11 49-65 PF00023B 
14.20 1.818e-09 45-55 


1212 


PR00048


SIGNATURE 


PR00048A 10.52 7.750e- 
14 227-241 PR00048A 
10.52 4.316e-11 199-
213 


1213 


PR00450


RECOVERIN FAMILY
SIGNATURE 


PR00450C 12.22 1.720e- 
10 20-42 PR00450C
12.22 3.506e-09 56-78 
PR00450D 16.58 6.769e- 
09 44-64 


1216


BL00412 


Neuromodulin (GAP-43)
proteins . 


BL00412D 16.54 5.598e- 
10 179-230 


1219 


PR00456


RIBOSOMAL PROTEIN P2 

SIGNATURE 


fKUU4boJb i . Ub 5.348e- 
11 249-264 


1222 


PD00066


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 7.231e-
15 295-308 PD00066

-L-5 . 3 Z. /.^Jle- 13 4U6- 

419 PD00066 13.92 
2.286e-12 378-391 
PD00066 13.92 7.857e- 
12 434-447 PD00066
13.92 3.348e-11 350-
363


1223 


BL50058 


G-protein gamma subunit 
profile. 


BL50058 27.23 1.000e-
40 13-61 


1226 


BL00412 


Neuromodulin (GAP-43) 
proteins . 


BL00412D 16.54 8.439e- 
09 279-330 


1227 


BL00437 


Catalase proximal heme- 
ligand proteins. 


BL00437A 18.82 1.000e-
40 49-101 BL00437B
16.28 1.000e-40 114-
168 BL00437C 21.86
1.000e-40 190-239
BL00437D 25.72 1.000e-
40 248-301 BL00437E
23.95 1.000e-40 327-
379


1230


BL01160


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 8.297e- 
10 5-60 


1231 


PR00735 


GLYCOSYL HYDROLASE 
FAMILY 8 SIGNATURE 


PR00735A 11.19 6.857e-
09 391-405 


1232 


PR00497 


NEUTROPHIL CYTOSOL 
FACTOR P40 SIGNATURE 


PR00497A 6.92 5.553e-
10 158-176 


1233 


PR00497 


NEUTROPHIL CYTOSOL 
FACTOR P40 SIGNATURE


PR00497A 6.92 5.553e- 
10 158-176 


1235 


BL00866 


Carbamoyl-phosphate
synthase subdomain
proteins.


BL00866B 36.29 2.776e- 
09 75-121 


1237 


BL00027 


'Homeobox' domain
proteins . 


21 36-79 


1243 


PR00403 


* * ■ • wwj mill ulvli/ll UAD 


11 10-25 


1246 


PD01168 


SYNTHETASE LIGASE
PROTEIN ALANYL. 


nnn 1 1 COT. Q d*7 ~> on- 
rUUlXOOli J .*% t £ , o j /e— 

10 31-46 PD01168L

9.47 4.490e-10 174-189 

ruUilb oJj 3.4/ /.Ol4ae — 

10 183-198 


1249 


BL00018 


domain proteins. 


183-196 


1254 


BL00183


Ubiquitin-conjugating
enzymes proteins . 


oiivvioj ^ o . y / ^ ,44Ue- 
36 96-144 


1255 


BL01115


GTP-binding nuclear
protein ran proteins. 


BL01115A 10.22 5.670e-
11 8-52 


1256


BL00373 


Phosphoribosylglycinamid
proteins.


BL00373C 10.35 3.348e- 


1258 


PR00011 


TYPE III EGF-LIKE
SIGNATURE 


PR00011B 13.08 3.217e-
10 174-193 


1259 


BL00518 


Zinc finger, C3HC4 type
(RING finger), proteins.


BL00518 12.23 8.286e- 
10 31-40 


1261 


PR00070 


DIHYDROFOLATE REDUCTASE 
SIGNATURE 


PR00070D 11.63 1.000e-
15 112-127 PR00070C 
ij - uy 9.D00e-i5 51-63 
PR00070A 12.92 S.SOOe- 
12 16-27 


1262 


BL00462 


Gamma-
glutamyltranspeptidase
proteins. 


BL00462A 20.89 6.438e- 
24 140-183 BL00462B 
17.88 5.500e-20 230- 
267 BL00462C 27.41
2.023e-11 292-347


1263 


BL00038 


Myc-type, 'helix-loop-
helix' dimerization
domain proteins. 


BL00038B 16.97 9.455e- 
11 62-83 


1264 


BL01115 


GTP-binding nuclear
protein ran proteins. 


BL01115A 10.22 5.670e-
11 17-61 


1266 


PR00837 


ALLERGEN V5/TPX-1 FAMILY 
SIGNATURE 


PR00837C 17.21 2.714e- 

J. □ XO?"*J10 4 rKUUo J /A 

14.77 4.512e-12 86-105 
rnvuoj tu ii.i^ /.3//e— 
12 201-215 


1269


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449C 17.27 9.308e-
22 40-63 PR00449E
13.50 1.000e-16 137-
160 PR00449D 10.79
3.520e-11 102-116


1270 


BL00276 


Channel forming colicins 
proteins. 


BL00276A 8.87 1.500e- 
09 17-29 


1275 


PD02327 


GLYCOPROTEIN ANTIGEN 
PRECURSOR IMMUNOGLO. 


PD02327C 15.47 9.769e-
09 228-243 


1276


PR00412 


EPOXIDE HYDROLASE 


PR00412B 12.59 7.894e-
SEQ ID NO:


ACCESSION 
NO. 


DESCRIPTION 


RESULTS* 






SIGNATURE 


12 119-135 PR00412C 
11.30 1.857e-11 165-
179 PR00412A 13.23
3.400e-11 100-119


1277 


PF00756 


Putative esterase. 


PF00756C 14.12 5.538e-
10 127-157 


1279 


BL00134 


Serine proteases, 
trypsin family, 
histidine proteins. 


BL00134A 11.96 9.325e- 
13 128-145 


1280 


BL01220 


Phosphatidylethanolamine
-binding protein family
proteins.


BL01220C 14.75 9.348e- 
15 248-276 


1285 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins.


BL00518 12.23 2.286e- 
10 33-42 


1287 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin 
receptors . 


PF00791B 28.49 7.182e- 
11 288-343


1292


PR00802 


SERUM ALBUMIN FAMILY 
SIGNATURE 


PR00802B 16.51 1.610e-
10 81-105 


1297 


PR00716 


M-PHASE INDUCER
PHOSPHATASE SIGNATURE 


PR00716C 17.65 5.696e- 
09 23-44 


1298 


BL00478 


LIM domain proteins.


BL00478B 14.79 6.478e-
14 268-283


1301 


BL00127 


Pancreatic ribonuclease 
family proteins. 


BL00127C 31.49 3.571e- 
28 82-126 BL00127B 
26.57 8.800e-28 23-68


1302 


PR00637 


TYPE 3 BOMBESIN RECEPTOR 
SIGNATURE 


PR00637E 11.27 4.250e- 
09 290-306 


1307 


BL00215 


Mitochondrial energy 
transfer proteins . 


BL00215A 15.82 5.500e-
17 13-38 BL00215A
15.82 1.000e-16 226-
251 BL00215A 15.82 
2.658e-13 107-132 


1308 


PR00898


VASOPRESSIN V2 RECEPTOR 
SIGNATURE 


PR00898H 11.34 4.[illegible]e-
09 552-572 


1309 


PD00301 


PROTEIN REPEAT MUSCLE
CALCIUM-BI.


PD00301B 5.49 2.731e- 
09 390-401 


1310 


BL00983 


Ly-6 / u-PAR domain 
proteins. 


BL00983C 12.69 9.654e- 
13 73-89 BL00983B 
8.19 3.132e-09 12-22 


1313 


BL00194 


Thioredoxin family 
proteins . 


BL00194 12.16 1.900e-
11 15-28 


1314 


BL00594 


Aromatic amino acids 
permeases proteins. 


BL00594A 16.75 8.969e- 
10 53-97 


1316 


BL00134 


Serine proteases, 
trypsin family, 
histidine proteins. 


BL00134A 11.96 9.325e- 
13 128-145 


1320 


BL00783 


Ribosomal protein L13
proteins.


BL00783C 22.43 6.559e- 
24 87-117 BL00783A 
14.55 1.600e-19 8-33 
BL00783B 12.76 3.500e- 
12 74-86 


1327 


PF00514


Armadillo/beta-catenin-
like repeat proteins.


PF00514A 31.30 7.268e- 
11 82-120 


1329 


BL00030


Eukaryotic RNA-binding 
region RNP-1 proteins. 


BL00030A 14.39 6.294e- 
11 129-148 BL00030B 
7.03 4.789e-09 168-178 


1331 


PR00497 


NEUTROPHIL CYTOSOL 
FACTOR P40 SIGNATURE 


PR00497A 6.92 7.239e-
09 25-43 


1332 


PR00161


NICKEL-DEPENDENT
HYDROGENASE/B-TYPE 
CYTOCHROME SIGNATURE 


PR00161C 9.51 4.930e- 
09 317-337 


1333 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 6.769e- 
33 10-49 


1336 

1337


PR00700


PROTEIN TYROSINE
PHOSPHATASE SIGNATURE


PR00700D 12.47 2.200e-
09 262-281 




PR00700 


PROTEIN TYROSINE 


PR00700D 12.47 2.200e-






PHOSPHATASE SIGNATURE 


09 211-230 


1340 


PR00860 


VERTEBRATE 

METALLOTHIONEIN 

SIGNATURE 


PR00860A 5.46 5.034e- 
13 5-18 


1341 


BL00893 


mutT domain proteins. 


BL00893 18.99 6.750e- 
16 46-71 


1343 


BL01282 


BIR repeat proteins. 


BL01282B 30.49 5.974e-
21 383-422 


1344 


DM00099 


4 kw A55R REDUCTASE 
TERMINAL 

DIHYDROPTERIDINE.


DM00099B 14.73 8.313e-
09 417-427 


1345 


BL00923 


Aspartate and glutamate 
racemases proteins. 


BL00923B 11.41 5.935e-
10 135-146 


1348 


PF00651


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 7.231e- 
13 44-57 


1350 


PR00193 


MYOSIN HEAVY CHAIN 
SIGNATURE 


PR00193D 14.36 3.571e-
32 416-445 PR00193C
12.60 6.318e-31 179-
207 PR00193B 11.69
3.571e-24 133-159
PR00193E 19.47 9.069e-
22 470-499 PR00193A
15.41 1.783e-20 77-97


1352 


PR00447 


NATURAL RESISTANCE-
ASSOCIATED MACROPHAGE 
PROTEIN SIGNATURE 


PR00447B 9.73 1.554e- 
15 299-319 PR00447D 
13.54 3.408e-15 200- 
224 PR00447A 12.73 
6.357e-11 97-124
PR00447G 6.69 9.877e-
10 353-373


1353 


BL00303 


S-100/ICaBP type calcium
binding protein.


BL00303A 21.77 6.667e-
26 45-82 BL00303B
26.15 1.000e-24 93-130


1355 


BL00039 


DEAD-box subfamily ATP-
dependent helicases 
proteins. 


BL00039D 21.67 [illegible]e-
29 375-421 BL00039A 
18.44 7.136e-29 99-138 
BL00039C 15.63 4.000e- 
18 225-249 BL00039B 
19.19 3.182e-14 141-
167 


1357 


PF00615


Regulator of G protein 
signalling domain 
proteins . 


PF00615B 16.25 2.216e-
12 84-101 PF00615C 
10.06 8.412e-12 162- 
176 




1360 


PD01066 


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 9.234e- 
29 10-49 




1361 


PR00925


NONHISTONE CHROMOSOMAL
PROTEIN HMG17 FAMILY 
SIGNATURE 


PR00925A 5.47 5.091e- 
18 14-29 PR00925B 
3.73 6.143e-14 29-42 
PR00925C 5.57 4.789e- 
12 53-64 PR00925D 
6.56 1.857e-10 76-87 




1362 


BL01272 


Glucokinase regulatory 
protein family proteins. 


BL01272B 19.61 6.870e-
30 136-171 BL01272C 
22.68 3.314e-25 249- 
274 BL01272A 6.49 
1.231e-18 99-117




1363 


BL01272


Glucokinase regulatory
protein family proteins.


BL01272B 19.61 6.870e- 
30 113-148 BL01272C 
11.68 3.314e-25 226- 
251 BL01272A 6.49 
1.231e-18 76-94




1364 


DM00179


kw KINASE ALPHA ADHESION
T-CELL.


DM00179 13.97 5.304e-
39 167-177 


1 - 


1368


1370


PR00169


PR00988


POTASSIUM CHANNEL
SIGNATURE
URIDINE KINASE SIGNATURE


PR00169A 16.77 1.592e-
09 76-96

PR00988A 6.39 1.794e-








10 1-19 


1371 


BL00242 


Integrins alpha chain 
proteins . 


BL00242B 8.13 8.615e-
09 469-479 


1372 


PR00625


DNAJ PROTEIN FAMILY 
SIGNATURE 


PR00625B 13.48 7.353e- 
19 46-67 PR00625A 
12.84 1.391e-16 14-34


1373 


BL00434 


HSF-type DNA-binding 
domain proteins.


BL00434C 23.85 3.778e- 
09 90-130 


1374 


PR00962


LETHAL (2) GIANT LARVAE
PROTEIN SIGNATURE


PR00962C 8.00 6.337e-
09 505-526 


1375


PD02475


MUCIN EPITHELIAL TUMOR-
ASSOCIATE.


PD02475A 23.18 8.552e-
10 1111-1150 


1376 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 9.571e- 
32 24-63 


1380 


BL00194


Thioredoxin family
proteins.


BL00194 12.16 8.333e-
12 48-61 


1381 


DM01970


0 kw ZK632.12 YDR313C 
ENDOSOMAL III. 


DM01970B 8.60 1.458e- 
15 1123-1136 


1383 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 7.600e-10 
243-254 


1384 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 7.600e-10
271-282 


1385 


BL00303 


S-100/ICaBP type calcium
binding protein.


BL00303B 26.15 6.203e- 
10 95-132 


1386 


BL01160 


Kinesin light chain
repeat proteins.


BL01160B 19.54 5.042e-
09 1574-1628 


1387 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins.


BL00518 12.23 1.000e-
11 52-61 


1389 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 3.600e- 
30 10-49 


1390 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 3.512e- 

O X 3 4 — 1 X 


1392 


PR00308 


TYPE I ANTIFREEZE 
PROTEIN SIGNATURE 


PR00308C 3.83 9.723e- 
10 127-137 


1393 


PR00380


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 9.625e- 
25 88-110 PR00380D 
9.93 2.406e-20 304-326 
PR00380B 12.64 4.414e- 
16 208-226 PR00380C 
13.18 [illegible]
262 


1394 


PD00066


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 3.400e-

13.92 8.800e-14 348-
361 PD00066 13.92
9.571e-12 405-418
PD00066 13.92 6.087e-
11 490-503 PD00066
13.92 8.043e-11 320-
333 


1398 


PD01066 


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 6.786e-
32 10-49 


1400 


DM01206 


CORONAVIRUS NUCLEOCAPSID 
PROTEIN. 


DM01206B 10.69 7.038e- 
09 270-290 


1406 


PD00930 


PROTEIN GTPASE DOMAIN 
ACTIVATION. 


PD00930A 25.62 7.324e- 
15 363-389 


1407 


BL00030 


Eukaryotic RNA-binding
region RNP-1 proteins.


BL00030A 14.39 7.500e- 
10 457-476 


1408 


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE


PR00019A 11.19 9.550e-
11 179-193 PR00019A 
11.19 8.826e-10 228- 
242 PR00019B 11.36 
1.360e-09 199-213 
PR00019B 11.36 4.960e-



1409 



1410 



1412 



1414 



PR00510 



PD00078 



BL00358



BL00282 



BL00023 



NEBULIN SIGNATURE 



09 176-190 



REPEAT PROTEIN ANK 
NUCLEAR ANKYR. 



Ribosomal protein L5
proteins. 



Kazal serine protease 
inhibitors family 
proteins . 



PR00510A 9.09 4.150e- 
12 182-202 PR00510B 
12.96 8.767e-12 210- 
230 PR00510F 9.88 
8.172e-10 58-75 
PR00510D 9.21 2.367e- 
09 251-267 



PD00078B 13.14 5.696e- 
09 31-44 



BL00358B 22.76 1.000e-
40 57-103 BL00358C
13.75 6.087e-14 122-
136 BL00358D 14.26
5.500e-13 143-158
BL00358A 13.06 1.931e-
11 33-44 



BL00282 16.88 7.338e-
10 511-534 



Type II fibronectin 
collagen-binding domain 
proteins . 



BL00023 24.31 4.300e- 
29 40-77 



1418 



1419 
1420 



PR00681 



DM00973 



PR00319 



PD01941 



RIBOSOMAL PROTEIN S1
SIGNATURE



3 kw RESISTANCE BENOMYL
YLL028W CYCLOHEXIMIDE.



PR00681G 12.54 2.149e-
09 38-60 



BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE 



DM00973A 21.17 1.462e- 
09 171-208 



TRANSMEMBRANE 
COTRANSPORTER SYMP.



PR00319B 11.47 1.571e-
09 428-443 



PD01941A 14.81 1.000e-
40 142-196 PD01941B
15.02 7.049e-30 400-
447 PD01941E 15.92
2.475e-20 817-864
PD01941C 19.96 3.118e-
19 488-543 PD01941D
27.18 9.614e-18 641-
690 PD01941F 28.52 
5.382e-15 1038-1093 



1423 



1424 



1425 



1426 



1427 



1428 



1429 



1430



1431 



PR00205



PR00209 



BL50002 



PF00628



PF00628



PR00405 



BL00039



CADHERIN SIGNATURE



ALPHA/BETA GLIADIN 

FAMILY SIGNATURE 

Src homology 3 (SH3) 
domain proteins profile. 



PR00205B 11.39 8.043e- 
12 199-217 



PR00209B 4.88 [illegible].318e-
11 1009-1028 



PHD-finger.



PHD-finger.



HIV REV INTERACTING 
PROTEIN SIGNATURE 



PR00320 



PR00378 



PR00928 



DEAD-box subfamily ATP-
dependent helicases 

proteins . 

G-PROTEIN BETA WD-40

REPEAT SIGNATURE 



INOSITOL PHOSPHATASE 
SIGNATURE 



BL50002A 14.19 8.200e- 
14 367-386 BL50002A 
14.19 9.250e-12 298- 
317 BL50002A 14.19 
4.462e-11 208-227
BL50002B 15.18 1.000e-
09 244-258 



PF00628 15.84 3.045e- 
12 330-345 



PF00628 15.84 3.045e- 
12 377-392 



PR00405B 11.83 5.114e-
16 281-299 PR00405A 
17.71 4.306e-14 262- 
282 



BL00039D 21.67 5.219e- 
34 147-193 



PR00320C 13.01 8.920e- 
10 577-592 



GRAVES DISEASE CARRIER 



PR00378D 16.86 7.563e- 
12 295-314 PR00378B 
13.80 8.650e-10 166- 
186 



PR00928B 13.53 3.769e-






PROTEIN SIGNATURE 


10 103-124 


1433 


BL01113 


Clq domain proteins. 


BL01113B 18.26 7.049e- 
15 14-50 BL01113C 
13.18 7.000e-12 82-102 


1434 


PR00319 


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE 


PR00319B 11.47 7.983e- 
10 135-150 


1436 


BL00030 


Eukaryotic RNA-binding
region RNP-1 proteins. 


BL00030A 14.39 1.000e-
12 84-103 


1438 


BL00290 


Immunoglobulins and 
major histocompatibility 
complex proteins. 


BL00290B 13.17 2.500e-
09 250-268 BL00290A 
20.89 4.000e-09 188- 
211 


1440 


PR00806 


VINCULIN SIGNATURE 


PR00806B 4.28 4.960e- 
09 38-52 


1441 


PR00806


VINCULIN SIGNATURE 


PR00806B 4.28 4.960e- 
09 88-102 


1444 


BL00422 


Granins proteins. 


BL00422D 19.48 1.000e-
08 114-138 


1445 


PD01841


PHOSPHORYLASE KINASE
ALPHA MUSCL.


PD01841A 21.71 1.000e-
40 73-123 PD01841B
14.35 1.000e-40 144-
185 PD01841D 17.87
1.000e-40 206-258
PD01841F 13.36 1.000e-
40 296-345 PD01841G
24.26 1.000e-40 349-
403 PD01841I 23.00
1.000e-40 494-536
PD01841J 14.94 1.000e-

/ A qqc nn(\i o a i r 
4U oi)3-yi4 JrJJUlo4J.Lt 

18.42 1.000e-40 1083-
J.i<iD trUVJX a ^ JLJ5 lo.oU 
9.719e-38 258-296 

Onm DAI V 1 A Dl t AAn a 

35 1041-1071 PD01841H 
21.30 3.189e-31 [illegible]-
472 PD01841C 13.78 
1.000e-25 185-206 
PD01841M 10.82 1.250e- 
20 1175-1194 


1446 


PF00816 


H-NS histone family.


PF00816B 13.84 8.875e- 
09 190-220 


1447 


PR00048 


C2H2-TYPE ZINC FINGER
SIGNATURE


PR00048A 10.52 2.080e-
09 402-416 


1448 


DM00315 


072 RIBONUCLEASE 
INHIBITOR. 


DM00315D 18.40 7.393e- 
09 23-67 


1451 


BL00030


Eukaryotic RNA-binding 
region RNP-1 proteins. 


BL00030B 7.03 2.800e- 
10 94-104 


1454 


DM01688 


2 POLY-IG RECEPTOR. 


DM01688D 13.44 7.146e- 
09 382-405 


1455 


PF00777


Sialyltransferase
family. 


PF00777C 18.60 2.929e- 
22 4-59 


1457 


BL00927


Trehalase proteins. 


BL00927C 10.83 8.085e-
09 42-53


1460


BL00545


Aldose 1-epimerase
proteins.


BL00545C 11.28 7.353e-
17 169-182 BL00545A
10.20 2.071e-15 73-89 
BL00545B 13.10 3.942e- 
09 140-153 


1466 


PR00097 


ANTHRANILATE SYNTHASE
COMPONENT II SIGNATURE


PR00097C 9.42 9.069e-
09 233-245 


1472 


BL01129 


Hypothetical 
yabO/yceC/sfhB family 
proteins . 


BL01129E 13.25 5.250e- 
22 170-195 BL01129C 
25.56 9.526e-18 63-106 


1473 


BL00790


Receptor tyrosine kinase 
class V proteins. 


BL00790I 20.01 2.821e- 
09 2114-2145 


1475 


PF00686 


Starch binding domain 
proteins . 


PF00686A 13.45 9.100e- 
09 267-277


1477 


PF00566 


Probable rabGAP domain 
proteins . 


PF00566A 12.64 7.333e- 
10 466-476 


1478 


BL00030 


Eukaryotic RNA-binding
region RNP-1 proteins. 


BL00030B 7.03 9.400e- 
10 43-53 


1479 


DM00406 


GLIADIN.


DM00406 7.73 8.*41e-10 
292-305 


1480 


BL00290


Immunoglobulins and 
major histocompatibility 
complex proteins. 


BL00290B 13.17 2.385e-
15 69-87 BL00290A
20.89 5.091e-11 12-35


1481 


PR00150


PHOSPHOENOLPYRUVATE
CARBOXYLASE SIGNATURE 


PR00150F 10.45 9.039e- 
09 21-51 


1482 


PF00780 


Domain found in NIK1-
like kinases, mouse 
citron and yeast ROM. 


PF00780I 14.69 4.825e-
09 107-137 


1483 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 1.153e-
09 108-162 


1485 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL- 
BINDING NU. 


PD01066 19.43 5.909e- 
25 17-56 


1486 


BL00107 


Protein kinases ATP-
binding region proteins. 


BL00107B 13.31 1.529e- 
09 34-50 


1488 


BL00039 


DEAD-box subfamily ATP-
dependent helicases 
proteins . 


BL00039D 21.67 9.586e- 
10 116-162 


1490 


BL00166 


Enoyl-CoA 

hydratase/isomerase 
proteins . 


BL00166D 22.87 2.607e- 
24 190-226 BL00166C 
18.93 5.500e-14 140-
167 BL00166B 16.92
9.357e-11 93-115


1491 


BL00452 


Guanylate cyclases 
proteins . 


BL00452D 28.59 3.700e-
31 63-106 BL00452E
11.92 3.045e-13 115- 
131 


1492 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019A 11.19 3.667e- 
09 532-546 


1497 


BL00107 


Protein kinases ATP-
binding region proteins.


BL00107B 13.31 1.000e-
11 384-400 BL00107A
18.39 5.345e-11 322-
353 


1500 


PF00876 


Ogre family. 


PF00876E 7.99 1.947e- 
10 107-117 


1502 


BL00027 


'Homeobox' domain
proteins . 


BL00027 26.43 4.789e- 
24 112-155 


1503 


BL00027 


'Homeobox' domain
proteins . 


BL00027 26.43 4.789e- 
24 112-155 


1505 


BL01177 


Anaphylatoxin domain 
proteins . 


BL01177E 20.64 5.800e- 
24 448-475 BL01177C 
17.39 5.333e-19 402- 
421 BL01177B 13.61 
7.840e-16 155-171 
BL01177D 17.50 1.900e- 
15 427-445 


1506 


BL00972


Ubiquitin carboxyl-
terminal hydrolases
family 2 proteins.


BL00972D 22.55 5.500e-
14 311-336 BL00972A 
11.93 7.429e-14 48-66 
BL00972E 20.72 8.759e- 
10 341-363 


1512 


BL00523 


Sulfatases proteins.


BL00523E 19.27 4.536e-
22 76-106 BL00523D
9.89 1.563e-11 40-52
BL00523F 10.85 4.162e- 
09 159-170 BL00523G 
9.46 5.333e-09 256-266 


1516 


BL00914 


Syntaxin / epimorphin 
family proteins. 


BL00914 24.91 7.045e- 
14 168-218 


Abl6 


BL00600


Aminotransferases class-
III pyridoxal-phosphate
attachment si.


BL00600A 17.98 6.143e- 
19 98-122 BL00600E 
16.43 1.771e-17 302-












331 BL00600G 12.43
9.625e-17 377-396
BL00600B 19.60 5.091e-
15 160-186 BL00600C
16.18 6.040e-12 190-
206 BL00600F 8.77
1.000e-11 343-356
BL00600D 8.71 1.000e-
10 281-295 


1523 


PD00930 


PROTEIN GTPASE DOMAIN 
ACTIVATION. 


PD00930B 33.72 9.600e- 
18 41-82 


1528 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE


PR00320B 12.19 4.774e-

12.19 8.839e-11 272-
287 PR00320B 12.19 
9.743e-10 106-121 
PR00320A 16.74 1.878e- 

09 192-207 PR00320A
16.74 2.317e-09 106- 
121 PR00320A 16.74 
8.683e-09 272-287 
PR00320C 13.01 8.800e- 
09 106-121 


1538 


DM01970 


0 kw ZK632.12 YDR313C
ENDOSOMAL III.


DM01970B 8.60 4.508e-
15 171-184 


1539 


PF00781 


Diacylglycerol kinase
catalytic domain
proteins (presumed).


PF00781D 11.11 7.593e- 
10 103-127 


1540 


PR00965 


OCULAR ALBINISM TYPE 1 
PROTEIN SIGNATURE 


PR00965H 10.73 1.231e-
29 312-334 PR00965E 
12.93 5.846e-29 172- 

1.123e-28 209-231 
rKUUsosL la . sj** J. . uuue — 
27 131-151 PR00965D
5.84 1.000e-27 150-170 
PR00965G 8.52 2.440e- 
27 258-279 PR00965B 
4.80 8.650e-26 88-109 
PR00965A 12.52 1.000e-
25 35-55 PR00965I 
3.91 6.442e-25 385-406 


1541 


BL01013 


Oxysterol-binding
protein family proteins. 


BL01013D 26.81 9.719e- 
17 163-207 


1543 


PD02699 


PROTEIN DNA-BINDING
BINDING DNA. 


PD02699C 24.84 1.000e-
40 599-646 PD02699A 
8.91 2.286e-34 219-248 
PD02699B 18.28 6.143e- 
21 485-509 


1544 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 7.857e- 
10 182-197 PR00049D 
0.00 7.102e-09 67-82 


1547 


BL00951 


ER lumen protein 
retaining receptor 
proteins . 


BL00951C 19.35 1.000e-
40 93-142 BL00951D
13.94 8.714e-40 142-
177 BL00951A 15.10
1.000e-38 2-38
BL00951B 14.23 6.250e-
33 38-69 


1548 


BL00536 


Ubiquitin-activating
enzyme proteins.


BL00536F 13.65 8.920e-
30 279-318 BL00536D 
22.91 5.737e-24 21-65 
BL00536E 16.94 4.696e- 
18 248-279 


1549 


PR00139 


ASPARAGINASE/GLUTAMINASE
FAMILY SIGNATURE 


PR00139C 11.72 9.679e- 
09 550-569 


1553


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 5.119e-
09 58-73


1556 


BL00061 


Short-chain
dehydrogenases/reductases
family proteins.


BL00061B 25.79 6.276e-
13 67-105 


1557 


BL01228 


Hypothetical cof family 
proteins . 


BL01228D 17.44 8.105e- 
12 107-132 


1558 


BL01228 


Hypothetical cof family 
proteins . 


BL01228D 17.44 8.105e- 
12 107-132 


1559 


BL01228 


Hypothetical cof family 
proteins. 


12 107-132 


1562 


BL00522 


DNA polymerase family X
proteins . 


DJjU KJ Z3 <5 £, \. 11 . 3U t> . OUU6- 

18 412-436 BL00522B 
27.30 1.738e-16 364- 
410 BL00522A 25.52 
6.000e-16 279-326 
BL00522E 19.63 6.123e-
14 502-532 BL00522F 
14.90 2.385e-13 551- 
575 


1563 


PF00651 


BTB (also known as BR-
C/Ttk) domain proteins.


PF00651 15.00 1.947e-
11 46-59 


1564 


BL00299 


Ubiquitin domain 
proteins . 


BL00299 28.84 2.823e- 
10 324-376 


1566 


BL01013


Oxysterol-binding
protein family proteins.


BL01013D 26.81 8.594e-
17 184-228 BL01013C
9.97 4.906e-12 14-24


1567 


BL00678 


Trp-Asp (WD) repeat 


BL00678 9.67 3.400e-10
BL00678 9.67
5.800e-10 418-429 
BL00678 9.67 8.800e-10 
295-306 


1570


BL00479 


Phorbol esters / 
diacylglycerol binding 
domain proteins. 


BL00479B 12.57 5.235e- 
17 297-313 BL00479A 
19.86 6.625e-15 271- 
294 BL00479A 19.86 
2.667e-14 147-170 
BL00479B 12.57 6.294e- 
12 173-189 


1576 


PR00665


OXYTOCIN RECEPTOR
SIGNATURE


PR00665G 12.36 4.673e- 
24 364-384 PR00665D 
9.93 1.200e-22 138-155 

PR00665F [illegible] 4.000e-
22 337-354 PR00665C

PR00665B 5.29 4.337e-
19 24-39 PR00665E
5.60 2.929e-15 246-260
PR00665A 5.99 5.622e-
15 11-25 


1577 


DM00099


4 kw A55R REDUCTASE
TERMINAL
DIHYDROPTERIDINE.


DM00099B 14.73 9.308e-
10 127-137 


1579 


BL00524 


Somatomedin B domain 
proteins. 


BL00524A 9.65 6.776e- 
14 52-73 


1580 


PD02894 


HYDROLASE N4- PRECURSOR 
PROTEIN SIGNAL BE. 


PD02894B 13.93 6.959e- 
16 182-215 PD02894A 
21.96 2.125e-10 57-103 


1581 


BL00411 


Kinesin motor domain 
proteins. 


BL00411C 15.04 5.292e-
12 32-54 BL00411H
15.66 4.441e-11 245-
276 


1582 


PR00604 


CLASS IA AND IB
CYTOCHROME C SIGNATURE 


PR00604A 11.13 2.440e- 
09 79-87 


1584 


PF00651


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 1.000e-
10 225-238 


1585


DM01551 


kw OSTEOINDUCTIVE YOPM 
MEMBRANE OUTER. 


DM01551C 14.62 9.455e- 
11 125-145 


1586


DM01354 


kw TRANSCRIPTASE REVERSE 
II ORF2. 


DM01354S 11.61 7.750e-
09 474-495


1587 


PR00072 


MALIC ENZYME SIGNATURE 


PR00072B 13.77 7.955e- 
33 180-210 PR00072A 
12.75 6.040e-25 120- 
145 PR00072C 11.42 
2.286e-24 216-239 
PR00072D 10.77 3.400e- 
22 276-295 PR00072E 
10.54 1.360e-19 301- 
Jib rKUUU f 4L\i JLU.4b 
5.304e-19 433-450 
PR00072F 8.87 5.935e- 
15 332-349 


1 COG 


BL00191


Cytochrome b5 family,
heme-binding domain
proteins . 


BL00191H 15.64 1.537e- 
22 61-113 BL00191K 
17.38 9.027e-12 398- 
442 


1590 


DM01970 


0 kw ZK632.12 YDR313C 
ENDOSOMAL III. 


DM01970B 8.60 7.716e- 
13 211-224 DM01970B 
8.60 2.157e-12 94-107 


1591 


DM00517 


5 kw NUCLEAR 60.7 NUP1 
CHROMOSOME . 


DM00517B 10.96 6.625e- 
16 1175-1193 DM00517A 
8.21 1.000e-11 1015-
1026 


1592 


BL00037 


Myb DNA-binding domain
proteins repeat proteins 
proteins . 


BL00037B 15.92 3.250e-
27 116-142 BL00037A
16.68 2.500e-24 83-107
BL00037A 16.68 3.250e-
12 31-55 BL00037B
15.92 3.526e-11 64-90
BL00037C 16.86 9.654e- 
10 146-164


1595


BL00028 


Zinc finger, C2H2 type, 
domain proteins . 


BL00028 16.07 1.514e- 
09 110-127 


1598 


PF00628


PHD-finger. 


PF00628 15.84 3.250e- 
11 1667-1682 


1599 




FIBRONECTIN TYPE III
REPEAT SIGNATURE 


PR00014D 12.04 5.500e- 
09 980-995 


1600


BL00518


Zinc finger, C3HC4 type
(RING finger), proteins.


BL00518 12.23 6.571e- 
10 30-39 


1602 


BL00412 


Neuromodulin (GAP-43)
proteins . 


BL00412D 16.54 5.402e- 
10 136-187 


1605 


PF00651


BTB (also known as BR-
C/Ttk) domain proteins. 


PF00651 15.00 3.571e- 
10 44-57 


1607 


BL00252


Interferon alpha, beta 
and delta family 
proteins. 


18.49 6.b57e- 
23 20-57 BL00252B 
19.78 9.125e-16 58-109 


1610 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 1.000e-
08 61-94


1611


BL00904


Protein
prenyltransferases alpha
subunit repeat proteins
proteins.


BL00904C 8.98 7.353e- 
10 91-125 BL00904D 
1.47 6.018e-09 127-168 


1612 


PF00168


C2 domain proteins.


PF00168C 27.49 3.250e-
09 365-391 


1613 


BL00412


Neuromodulin (GAP-43)
proteins.


BL00412D 16.54 6.051e-
09 932-983 BL00412D 
16.54 7.153e-09 933- 
984 


1614 


BL00559 


Eukaryotic molybdopterin 
oxidoreductases
proteins . 


BL00559I 13.63 3.531e- 
25 54-83 BL00559K 
13.17 2.957e-18 197- 
224 BL00559J 19.63
6.870e-16 124-176 
BL00559L 13.60 9.000e- 
16 266-284 


1615 



PD01427 


TRANSFERASE 
METHYLTRANSFERASE BI.


PD01427B 22.45 3.025e-
22 500-541 PD01427A 
19.94 8.773e-18 439-








472 


1616 


BL00115 


Eukaryotic RNA 
polymerase II 
heptapeptide repeat 
proteins . 


BL00115Z 3.12 7.485e- 
09 152-201 BL00115Z 
3.12 9.603e-09 145-194 


1617 


BL00303 


S-100/ICaBP type calcium 
binding protein.


BL00303B 26.15 7.750e-
32 51-88 BL00303A 
21.77 1.947e-31 4-41 


1618 


BL01254 


Fetuin family proteins. 


BL01254F 10.02 8.754e-
09 137-147 


1619 


PD01888 


I PEPTIDE REDUCTASE 
PROTEIN METHI. 


PD01888B 25.10 1.000e-
40 47-97 PD01888C 
21.56 7.000e-30 125- 
155 PD01888A 12.84 
8.800e-15 7-23 


1621 


PR00239 


MOLLUSCAN RHODOPSIN C- 
TERMINAL TAIL SIGNATURE 


PR00239E 1.58 3.[illegible]e-
09 692-704 PR00239E 
1.58 4.580e-09 697-709 
PR00239E 1.58 4.580e- 
09 702-714 PR00239E 
1.58 5.193e-09 703-715 


1622 


PR00860 


VERTEBRATE 

METALLOTHIONEIN 

SIGNATURE 


PR00860B 7.04 1.900e- 
18 27-41 PR00860C 
9.61 1.474e-14 41-51 
PR00860A 5.46 1.720e- 
14 5-18 


1624 


PR00784 


MITOCHONDRIAL BROWN FAT 
UNCOUPLING PROTEIN 
SIGNATURE 


PR00784D 15.86 8.027e- 
11 77-95 


1626 


BL00325 


Actin-depolymerizing 
proteins . 


BL00325B 21.66 1.000e-
40 93-139 BL00325A 


1631 


BL00064 


L-lactate dehydrogenase 
proteins . 


BL00064B 23.51 1.000e-
40 82-130 BL00064C
[illegible] 1.000e-40 137-
182 BL00064E 27.20
1.000e-40 223-275
BL00064F 25.14 7.882e- 
36 286-331 BL00064A 
21.16 1.000e-33 22-60 
BL00064D 14.19 6.500e- 
31 182-212


1632


PR00063


RIBOSOMAL PROTEIN L27 
SIGNATURE 


PR00063B 15.24 9.700e- 
11 59-84 PR00063A 
11.71 1.614e-09 34-59 


1634 


PR00239 


MOLLUSCAN RHODOPSIN C- 
TERMINAL TAIL SIGNATURE 


PR00239D 0.00 1.105e- 
11 36-49 PR00239C 
3.51 2.538e-09 37-45 


1636 


BL01210 


Caveolins proteins. 


BL01210B 13.92 9.531e- 
10 133-183 


1637 


BL00982 


Bacterial-type phytoene
dehydrogenase proteins. 


BL00982A 18.41 5.388e- 
11 11-43 


1639 


BL01183 


ubiE/COQ5 

methyltransferase family
proteins. 


BL01183B 21.31 8.144e- 
12 132-177 


1640 


PR00015 


GRAM-POSITIVE COCCUS
SURFACE PROTEIN ANCHOR 
SIGNATURE 


PR00015B 9.84 8.468e- 
10 128-149 


1641 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320B 12.19 5.935e- 
11 364-379 PR00320A 
16.74 7.828e-11 364-
379 PR00320C 13.01
2.800e-10 279-294
PR00320C 13.01 2.800e-
10 364-379 PR00320B
12.19 5.114e-10 279-
294 PR00320A 16.74
1.659e-09 279-294








PR00320A 16.74 2.098e-
09 229-244 


1642 


PF00023 


Ank repeat proteins. 


PF00023A 16.03 6.464e-
09 114-130 


1643 


PR00169 


POTASSIUM CHANNEL 
SIGNATURE 


PR00169A 16.77 1.806e- 
11 74-94 


1644 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 2.200e-10 
109-120 BL00678 9.67 
5.737e-09 528-539 


1645 


BL01108 


Ribosomal protein L24
proteins . 


BL01108A 20.33 7.366e- 
17 56-89 


1646 


PR00380


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 9.270e- 
21 103-125 PR00380D 
9.93 6.308e-18 386-408
PR00380C 13.18 7.923e-
16 332-351 PR00380B 

, D4 O . OD /e-13 

310 


1647 


DM01242 


3 THREONINE--TRNA
LIGASE.


37 340-381 DM01242E 

505 DM01242D 23.29 
3.925e-30 420-463 
DM01242B 23.57 8.054e- 

x o .c o j — .9 x ** Un u^<4<r 

10.61 7.618e-14 526- 
540 


1649 


PD00126 


PROTEIN REPEAT DOMAIN 
TPR NUCLEA. 


PD00126A 22.53 5.500e- 
10 13-34 


1651 


BL01160 


Kinesin light chain 
repeat proteins . 


BL01160B 19.54 6.720e- 
ll a^i _ 4 a £ 


1652 


BL00933 


FGGY family of 
carbohydrate kinases 
proteins.


BL00933A 17.50 4.673e- 
12 11-35 BL00933E 

x j - o vj j . £i / C"v7 t jo" 

472 


1653 


BL00795


Involucrin proteins.


BL00795C 17.06 5.[illegible]e-
10 70-115 


1654 


BL00982 


Bacterial-type phytoene
dehydrogenase proteins.


BL00982A 18.41 7.750e-
17 302-334 


1655


BL00982


Bacterial-type phytoene
dehydrogenase proteins.


BL00982A 18.41 7.750e-
17 282-314 


1656 


BL00741 


Guanine-nucleotide
dissociation stimulators 
CDC24 family sign. 


16 607-630 


1657 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 7.938e- 
11 114-136 


1658 


PR00910 


LUTEOVIRUS ORF6 PROTEIN
SIGNATURE 


PR00910A 2.51 8.889e- 
10 442-455 


1659 


BL00972 


Ubiquitin carboxyl- 
terminal hydrolases 
family 2 proteins. 


BL00972D 22.55 4.140e-
12 376-401 BL00972E 
20.72 5.629e-09 446- 
468 


1660 


BL00406 


Actins proteins.


BL00406D 12.58 8.767e-
15 188-243


1661 


PR00105 


CYTOSINE-SPECIFIC DNA
METHYLTRANSFERASE
SIGNATURE 


PR00105A 10.36 4.900e- 
13 1140-1157 PR00105B 
12.32 2.800e-12 1259- 
1274 PR00105C 10.86 
1.000e-10 1305-1319 


1662 


BL00280 


Pancreatic trypsin 
inhibitor (Kunitz) 
family proteins. 


BL00280 24.61 3.172e-
33 3119-3163 


1663 


PR00319


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE


PR00319D 11.64 6.625e- 
23 107-125 PR00319C 
13.41 5.714e-20 89-105 
PR00319A 15.27 5.286e- 
19 51-68 PR00319B 
11.47 8.200e-19 70-85


lob* 


BL00018


EF-hand calcium-binding
domain proteins.


BL00018 7.41 5.050e-10
489-502 


1667 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 8.500e- 
38 7-46 


1669 




NOL1/NOP2/sun family
proteins.


BL01153D 19.69 1.188e-
17 115-141 BL01153C 
13.67 8.977e-15 66-80 
BL01153B 20.52 1.885e- 
10 13-37 


1671 




PI3 KINASE P85
REGULATORY SUBUNIT
SIGNATURE


PR00678H 9.13 3.100e-
10 1146-1169 


1672 


BL00598 


Chromo domain proteins.


BL00598 14.45 8.500e- 
20 27-49 


1673 


PR00326 


GTP1/OBG GTP-BINDING
PROTEIN FAMILY SIGNATURE 


PR00326A 8.75 8.329e- 
09 686-707 


1674 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 7.580e- 
11 343-358 PR00049D
0.00 1.286e-10 342-357 


1676 


PR00747 


GLYCOSYL HYDROLASE
FAMILY 47 SIGNATURE


PR00747H 12.76 8.636e-
19 427-448 PR00747G 
14.50 2.286e-18 368- 
393 PR00747C 12.06 
7.500e-18 112-131 
PR00747A 14.05 4.600e-
17 42-63 PR00747D 
15.23 8.759e-17 163- 
183 PR00747E 15.13 
8.244e-15 254-272 
PR00747B 7.65 5.355e- 
13 75-90 PR00747F 
13.56 8.714e-10 311- 
328 


1677 


PR00747


GLYCOSYL HYDROLASE
FAMILY 47 SIGNATURE


PR00747H 12.76 8.636e- 
19 309-330 PR00747G 
14.50 2.286e-18 250- 
275 PR00747C 12.06 
7.500e-18 112-131 
PR00747A 14.05 4.600e- 
17 42-63 PR00747B 
7.65 5.355e-13 75-90 
PR00747F 13.56 8.714e- 
10 193-210 


1680 




Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 4.600e-10 
406-417 BL00678 9.67
6.684e-09 320-331 


1681 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 4.600e-10
329-340 BL00678 9.67 
6.684e-09 243-254 


1683 


PR00326 


GTP1/OBG GTP-BINDING


PR00326A 8.75 1.346e- 
13 389-410 


1685 


PR00646


RDC1 ORPHAN RECEPTOR 
SIGNATURE 


PR00646H 6.32 4.188e- 
09 755-771 


1690 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 6.644e- 
09 75-129 


1691 




RIBOSOMAL PROTEIN P2
SIGNATURE 


PR00456E 3.06 7.281e- 
10 418-433 PR00456E 

"i nC "7 OQ1 0-1 n A T Q At A 

PR00456E 3.06 8.125e- 
10 420-435 


1692 


PR00456


RIBOSOMAL PROTEIN P2 
SIGNATURE 


PR00456E 3.06 7.281e- 
10 487-502 PR00456E 
3.06 7.281e-10 488-503
PR00456E 3.06 8.125e- 
10 489-504 


BL00674


AAA-protein family 
proteins . 


BL00674C 22.60 8.043e- 
24 274-317 BL00674B 
4.46 4.000e-23 241-263 
BL00674D 23.41 8.560e- 

XO JjO" JOS ObUUO IHC, 

15.24 1.720e-15 414- 
434 


1697 


PR00409 


PHTHALATE DIOXYGENASE 
REDUCTASE FAMILY 
SIGNATURE 


PR00409F 12.70 4.388e- 
10 427-447 


1698 


PR00466 


CYTOCHROME B-245 HEAVY 
CHAIN SIGNATURE 


PR00466C 10.17 3.443e-
13 187-208 PR00466B
5.03 5.500e-11 162-186
PR00466F 9.16 6.159e-
09 498-517 


1699 


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 9.217e- 
12 283-300 BL00028 
16.07 3.769e-11 255-
272 BL00028 16.07
5.154e-11 171-188
BL00028 16.07 5.500e-
11 227-244 BL00028 
16.07 1.600e-10 199- 
216 


1700 


BL01019 


ADP-ribosylation factors 
family proteins. 


BL01019A 13.20 3.348e- 
15 62-102 BL01019B 
19.49 4.000e-15 107- 
162 


1703 


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 2.484e- 
12 200-239 


1707 


PR00109


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 4.558e- 
14 134-153 


1710 


PR00019


LEUCINE-RICH REPEAT
SIGNATURE


PR00019A 11.19 2.565e-
10 116-130 PR00019B
11.36 4.600e-09 113-
127 PR00019B 11.36 
/.12Qe-09 204-218 


1711 


BL01159 


WW/rsp5/WWP domain 


BL01159 13.85 6.523e- 
11 232-247 BL01159
13.85 5.408e-10 613- 
628 


1712 


PF00023 




Dirnnfma i a m *7 nnn« 
f r uuu^jh xb .uj / .uuue — 

10 187-203 


1713 


PF00642 


Zinc finger C-x8-C-x5-C-
x3-H type (and similar).


f f wUOIZ 11. jOvc 

11 230-241 


1714 


PF00642 


Zinc finger C-x8-C-x5-C-
x3-H type (and similar) 


PF00642 11.59 9.550e- 
11 230-541 


1715 


BL01115 


GTP-binding nuclear
protein ran proteins. 


BL01115A 10.22 7.129e- 
09 7-51 


1718 


BL00353 


HMG1/2 proteins. 


BL00353C 14.83 6.013e- 
10 136-183 BL00353B 
11.47 8.866e-09 86-136 


1719 


BL00412


Neuromodulin (GAP-43)
proteins. 


09 432-483 


1721 


BL00038 


Myc-type, "helix-loop-
helix" dimerization
domain proteins. 


BL00038B 16.97 8.448e- 
13.61 4.000e-ll 52-68 


1723 


PD00567


PROTEIN RNA- BINDING RNA 
REPEAT HYD. 


PD00567C 9.17 5.500e-
09 418-428 


1724 


BL01279


Protein-L-
isoaspartate (D-
aspartate) O-
methyltransferase signa.


BL01279A 24.27 5.663e- 
12 233-281 


1728 


BL00018 


EF-hand calcium-binding 
domain proteins. 


BL00018 7.41 2.059e-11
73-86 BL00018 7.41
4.176e-11 157-170


1730 


BL00594 


Aromatic amino acids 
permeases proteins. 


BL00594A 16.75 1.089e- 
09 17-61 


1731 


BL01160 


Kinesin light chain
repeat proteins. 


BL01160B 19.54 9.676e- 
10 296-350 


1732 


BL01160 


Kinesin light chain 
repeat proteins.


BL01160B 19.54 9.676e- 
10 316-370 


1733 


PF00850 


Histone deacetylase 
family . 


PF00850F 15.70 4.349e- 
22 246-279 PF00850D
14.76 6.850e-20 177-
201 PF00850E 8.88
8.691e-18 209-235
PF00850G 22.75 4.098e- 
14 281-323 


1734 


BL00354


HMG-I and HMG-Y DNA-
binding domain proteins
(A+T-hook).


BL00354C 6.61 5.932e- 
09 292-307 


1735 


DM00179 


w KINASE ALPHA ADHESION 
T-CELL. 


DM00179 13.97 5.263e- 
10 492-502


1743 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 1.188e- 
11 5-27 PR00449D 
10.79 2.241e-10 109- 
123 PR00449E 13.50 
9.289e-10 144-167 


1744 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 1.188e-
11 5-27 PR00449D 
10.79 2.241e-10 109- 
123 PR00449E 13.50 
9.289e-10 144-167 


1745 


BL00720


Guanine-nucleotide
dissociation stimulators 
CDC25 family sign. 


BL00720B 16.57 8.297e- 
15 136-160 


1746 


PR00081


GLUCOSE/RIBITOL 
DEHYDROGENASE FAMILY 
SIGNATURE 


PR00081B 10.38 6.727e- 
11 45-57 PR00081E 
17.54 3.935e-10 150- 
168 


1747 


BL00439


Acyltransferases
ChoActase / COT / CPT
family proteins.


BL00439H 18.24 8.435e-
14 65-91 BL00439G
13.40 2.895e-12 3-14 


1749 


PR00819 


CBXX/CFQX SUPERFAMILY 
SIGNATURE 


PR00819B 10.83 7.158e-
11 4-20 


1751 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 3.400e-
14 33-46 PD00066 
13.92 1.000e-13 89-102 
PD00066 13.92 7.000e- 
13 61-74 PD00066 
13.92 6.571e-12 117- 
130 


1753 


BL01013 


Oxysterol-binding
protein family proteins.


BL01013D 26.81 6.516e-
18 33-77 


1754 


BL00790 


Receptor tyrosine kinase 
class V proteins. 


BL00790I 20.01 2.393e-
09 490-521 BL00790I 
20.01 2.821e-09 60-91 
BL00790I 20.01 6.357e- 
09 287-318 


1756 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 9.750e-
35 10-49 


1758 


DM00406


GLIADIN. 


DM00406 7.73 7.600e-09 
653-666 


1762 


PD02929 


ADHESION GLYCOPROTEIN 
PRECURSOR I.


PD02929A 28.27 4.529e- 
09 224-278 


1765 


PR00326


GTP1/OBG GTP-BINDING
PROTEIN FAMILY SIGNATURE 


PR00326A 8.75 5.950e- 
11 146-167 


1775 


PF00023 


Ank repeat proteins. 


PF00023A 16.03 3.077e- 
14 523-539 


1776


BL00942 


glpT family of 
transporters proteins.


BL00942F 15.07 4.343e- 
10 371-389 BL00942B 
20.36 8.040e-09 94-137 


1777 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 2.373e- 
09 279-312 


1778 


BL00084 


Copper type II,
ascorbate-dependent
monooxygenases proteins.


BL00084D 25.11 3.700e-
20 169-224 BL00084B
24.26 8.134e-16 10-58
BL00084C 27.71 8.412e-
11 107-158


1779 


BL01013


Oxysterol-binding
protein family proteins. 


BL01013D 26.81 3.758e- 
18 611-655 BL01013A 
25.14 2.891e-15 344- 
380 BL01013C 9.97 
6.308e-13 435-445 
BL01013B 11.33 3.717e- 
12 409-420 


1783 


BL00741


Guanine-nucleotide 
dissociation stimulators 
CDC24 family sign. 


BL00741B 14.27 8.138e- 
13 492-515 


1784 


BL00741 


Guanine-nucleotide
dissociation stimulators 
CDC24 family sign. 


BL00741B 14.27 8.138e- 
13 492-515 



* Results include, in order: accession number subtype; raw score; p-value; position of
signature in amino acid sequence.
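
The entry order given in the footnote (accession number with subtype, raw score, p-value, position) can be read programmatically. The following is a minimal illustrative sketch only and is not part of the disclosed methods. The names `SignatureHit` and `parse_results` and the regular expression are assumptions introduced here; the pattern assumes a two-letter accession prefix, five digits and an optional subtype letter, as seen in the entries above.

```python
import re
from typing import List, NamedTuple


class SignatureHit(NamedTuple):
    accession: str    # accession number plus optional subtype letter, e.g. "PR00747H"
    raw_score: float
    p_value: float
    start: int        # first residue of the signature in the amino acid sequence
    end: int          # last residue of the signature


# Hypothetical pattern: accession, raw score, p-value, start-end.
_ENTRY = re.compile(
    r"([A-Z]{2}\d{5}[A-Z]?)\s+(\d+\.\d+)\s+(\d+\.\d+e-\d+)\s+(\d+)-(\d+)"
)


def parse_results(cell: str) -> List[SignatureHit]:
    """Parse one RESULTS cell, tolerating line wraps inside numbers."""
    text = re.sub(r"\s+", " ", cell)               # collapse line breaks
    text = re.sub(r"e- (\d)", r"e-\1", text)       # rejoin "8.636e- 19"
    text = re.sub(r"(\d)- (\d)", r"\1-\2", text)   # rejoin "368- 393"
    return [
        SignatureHit(m[1], float(m[2]), float(m[3]), int(m[4]), int(m[5]))
        for m in _ENTRY.finditer(text)
    ]


if __name__ == "__main__":
    cell = "PR00747H 12.76 8.636e-\n19 427-448 PR00747G\n14.50 2.286e-18 368-\n393"
    for hit in parse_results(cell):
        print(hit)
```

Entries whose characters are damaged (for example a letter in place of a digit) will simply not match and are skipped rather than guessed.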



WO 01/53312 



PCT/US00/34263



TABLE 4 



SEQ ID 
NO: 


PFAM NAME


DESCRIPTION


p-value


PFAM 
SCORE 


2 




Immunoglobulin domain 


2.1e-32 


109.5 


3 


pkinase


Eukaryotic protein kinase 
domain 


1.3e-29 


110.7 


4 




Zinc finger, C2H2 type


1.6e-21


84.9


5 


fn3


Fibronectin type III domain 


0 


1097.1 


6


fn3 


Fibronectin type III domain 


0 


1035.0 


7 




Fibronectin type III domain 


0 


1090.4 


8 


fn3 


Fibronectin type III domain 


0 


1097.1 


9


TBC 


TBC domain 


4e-40 


146.7


10 


p450 


Cytochrome P450 


9.5e-17


62.0


12 


ank 


Ank repeat 


6e-20 


79.7 


14 


ig 


Immunoglobulin domain 


1.7e-05 


22.7 


15


zf-MYND


MYND finger


1.3e-06 


35.4 


16 


zf-MYND


MYND finger 


1.3e-06 


35.4 


17 


zf-C2H2


Zinc finger, C2H2 type 


1.7e-99


343.9


18 


CAP_GLY 


CAP-Gly domain 


1.2e-25 


98.7


20 


IMPDH_C 


IMP dehydrogenase / GMP
reductase C terminus 


1.6e-119 


410.5 


21 


IMPDH_C


IMP dehydrogenase / GMP 
reductase C terminus 


4.3e-102 


352.6 


22 


pkinase 


Eukaryotic protein kinase 
domain 


2.4e-79 


277.0 


23 


pkinase 


Eukaryotic protein kinase 
domain 


8.4e-74


258.6 


25 


RNA_pol_A 


RNA polymerase alpha subunit 


0 


1077.7 


26 


C1q


C1q domain


1.9e-10


44.4 


27 


Ribosomal_L2
3


Ribosomal protein L23 


7.8e-32 


111.2 


28


Ribosomal_L2
3


Ribosomal protein L23 


1e-29


104.2 


30


zf-A20


A20-like zinc finger


1.5e-10


48.5 


31 


zf-A20


A20-like zinc finger


1.5e-10


48.5 


32 


FMN_dh


FMN-dependent dehydrogenase


5.4e-179 


608.1 


34 


PID 


Phosphotyrosine interaction
domain (PTB/PID)


3.8e-59


209.9 


35 


ig 


Immunoglobulin domain 


1.4e-13 


48.8 


36


ig 


Immunoglobulin domain 


1.4e-13 


48.8 


40 


kinesin 


Kinesin motor domain 


6.7e-76


265.6 


44 


Ets


Ets-domain


1.4e-56


182.1


45 


Ets 


Ets-domain


1.4e-56


182.1 


46 


LRR 


Leucine Rich Repeat 


1.7e-13


58.3 


48


zf-C2H2


Zinc finger, C2H2 type


2.3e-162


552.8 


49


ITAM


Immunoreceptor tyrosine-based
activation mot 


1.4e-05 


31.9 


50 


UCH-2


Ubiquitin carboxyl-terminal
hydrolase family


1.1e-26


102.0 


51 


UCH-2


Ubiquitin carboxyl-terminal
hydrolase family


1.1e-26


102.0 


52 




Ras family 


8.5e-45 


162.3 


53


PRK 


Phosphoribulokinase 


2.1e-65 


230.7 


54 


myb_DNA-binding


Myb-like DNA-binding domain


0.096


15.2 


55


voltage_CLC 


Voltage gated chloride channels 






56 


sugar_tr 


Sugar (and other) transporter


0.00015 


-64.3 


57 


TBC 


TBC domain 


2.2e-37 


137.6 


58 


ank 


Ank repeat 


5.9e-25 


96.3 


59 


ank 


Ank repeat 


5.9e-25 


96.3 


67 


PMP22_Claudi
n


PMP-22/EMP/MP20/Claudin family


7.9e-49 


175.6 


68 


C2 


C2 domain 


7.9e-54 


192.2 


69 


C2 


C2 domain 


2.3e-54 


194.0


70 


Kelch 


Kelch motif 


9.4e-99 


341.5 


72 


ig 


Immunoglobulin domain 


3.2e-28 


94.7


73 


pkinase 


Eukaryotic protein kinase
domain


8e-69


242.1


74 


pkinase 


Eukaryotic protein kinase 
domain 


2.8e-38


X *± v , © 


76 


zf-
C4_Topoisom


Topoisomerase DNA binding C4 
zinc fing 


5.4e-54 


192.8 


83 


Peptidase_S9


Prolyl oligopeptidase family


4.3e-10


3^.8 


84 


fn3 


Fibronectin type III domain


4.1e-51


183.2


86 


SH2 


Src homology domain 2 


3.1e-22


""^7.7 


88 


ig


Immunoglobulin domain


0.0091




09 


WD40 


WD domain, G-beta repeat 


2.1e-21


84.6


92 


laminin_G


Laminin G domain


6.1e-27


98.5


93 


AMP-binding 


AMP-binding enzyme 


a * h e — _ j 


-37.2


95 


pkinase 


Eukaryotic protein kinase 
domain 


1.4e-59 


211.4


96


pkinase


Eukaryotic protein kinase
domain 


^ . be-bl 


183.9


97 


adh_short


short chain dehydrogenase 


2e-61 


217.5


98 


kinesin 


Kinesin motor domain 


2.2e-86


300.4


101 


IRS 


PTB domain (IRS-1 type)


5.4e-36


133.0


102 


AAA 


ATPases associated with various 
cellular act 


6.8e-05


-5.2 


104 


pkinase 


Eukaryotic protein kinase
domain


2.7e-73


256.9


106 


ras 


Ras family 


8.3e-24


92.5


107 


FYVE 


FYVE zinc finger 


5.4e-27 


100.7


108 


Cyt_reductas
e


FAD/NAD-binding Cytochrome
reductase


7. 7e-6Sl 


215.5


109 


zf-C2H2 


Zinc finger, C2H2 type


2.3e-122


420.0


113 


pkinase 


Eukaryotic protein kinase 
domain 


4e-88 


306.2


116
117


PH 




3.1e-11


45.2




lipocalin 


Lipocalin / cytosolic fatty-
acid binding pr


2.4e-14


53.5


118 


pkinase 


Eukaryotic protein kinase 
domain 


4.5e-20 


76.3


120 


WD40


WD domain, G-beta repeat


2.4e-14


61.1


121 
123


WD40


WD domain, G-beta repeat


2.4e-14


61.1




IF5_eIF4_eIF
2


eIF4-gamma/eIF5/eIF2-epsilon




122.2


124 




Immunoglobulin domain 


6.5e-08


30.6


127 


mito_carr


Mitochondrial carrier proteins 


■Jo. 1 r 


58.6


128 


PP2C 


Protein phosphatase 2C 


2.2e-71 


250.6


129 


ATP1G1_PLM_M 
AT8 


ATP1G1/PLM/MAT8 family 


3.1e-20


80.6


130 


pfkB 


pfkB family carbohydrate kinase


4.5e-42 


137.1 


133 


ACBP 


Acyl CoA binding protein 




86.7


134


rrm


RNA recognition motif. 




118.5


135


IQ


IQ calmodulin-binding motif


2.6e-08


41.0


136


ATP1G1_PLM_M
AT8


ATP1G1/PLM/MAT8 family 


^ . j e — *ga 


85.7


139
140 


WH2 

zf-C2H2 


Wiskott Aldrich syndrome 
homology region 2 
Zinc finger, C2H2 type 


0.0067




141 


Peptidase_S2
6


Signal peptidase I 


1.7e-82 
5.7e-10 


287.5
35\? 


143 


arf


ADP-ribosylation factor family


1.2e-39


145.2


146 


KRAB 


KRAB box 


7*36 -30 


112.6




DUF6 


Integral membrane protein DUF6


0.096


8.0


149 


PDEase 


3'5'-cyclic nucleotide
phosphodiesterase 


3:ee-8o 


231.1 


151 
153 


S4 

tRNA-synt_1d


S4 domain

tRNA synthetases class I (R)


1.1e-08


32.3 


154 

155

157


Cyt_reductas
e
ras
actin


FAD/NAD-binding Cytochrome
reductase

Ras family
Actin


3.8e-103

7.8e-60

J.6e-28
1.8e-26


356.1
212.2

107.0
17.1


158 


Jacalin 


Jacalin-like lectin domain


0.09 


-24.9 


160


Zn_carbOpept


Zinc carboxypeptidase


5e-138


471.9


165 


pkinase 


Eukaryotic protein kinase 
domain 


5.1e-67 


236.1 


• T C 1 
J.O / 




Zinc finger, C3HC4 type (RING
finger)


5.3e-07


27.0


168


5 


Ribosomal protein S15 


x - le- ub 


29.0 


J. © J 


DEAD 


DEAD/DEAH box helicase


1e-48


157.0


J. /JL 




Domain of unknown function 


0.07


-17.4


172 


pkinase 


Eukaryotic protein kinase 
domain 


3.7e-15 


58.6 


173 


globin


Globin


4.6e-18


67.4


174 


WW 


WW domain 


7.3e-06 


32.9 


175 


ras 


Ras family 


1e-31


118.8


178 


ATP1G1_PLM_M
AT8 


ATP1G1/PLM/MAT8 family 


2.5e-17


71.0


179


zf-C2H2


Zinc finger, C2H2 type 


1.5e-99 


344.2


180


C1q


C1q domain


8.8e-72


251.9


190 


Y_phosphatas 
e 


Protein-tyrosine phosphatase


4.9e-287 


967.0 


191 


efhand 


EF hand 


7.5e-16 


66.1 


193 


pkinase 


Eukaryotic protein kinase 
domain 


6.5e-82 


285.6 


194 


bromodomain 


Bromodomain 


5.8e-31 


111.4


195 


PALP 


Pyridoxal-phosphate dependent
enzyme


2.5e-64


227.1 


197 


DnaJ 


DnaJ domain 


1.6e-38 


141.4 


199 


RrnaAD 


Ribosomal RNA adenine 
dimethylases 


0.00018 


16.9 


200 


acid_phospha
t 


Histidine acid phosphatase 


2.5e-10 


37.2 


201 


WH2 


Wiskott Aldrich syndrome 
homology region 2 


0.00048 


26.9 


204


vATP-
synt_AC39


ATP synthase (C/AC39) subunit 


1.3e-159 


543.7 




vATP-
synt_AC39 


ATP synthase (C/AC39) subunit 


1.6e-139


476.9


206 


ldl_recept_a


Low-density lipoprotein
receptor domain


2.4e-25


97.6


209 


ank


Ank repeat


1.4e-19


78.4


210


Rhomboid


Rhomboid family


0.0035


1.2


211 


C1q


C1q domain


1.6e-70


247.7


212 




Ubiquitin-conjugating enzyme


7.4e-74


258.8


213 


UQ_con 


Ubiquitin-conjugating enzyme


1e-53


191.9 


215 


DEAD 


DEAD/DEAH box helicase


1.8e-43


140.4 


216 


PMP22_Claudi
n


PMP-22/EMP/MP20/claudin family 


4.5e-21 


83.4 


218 


Glycos_trans
f_2


Glycosyl transferases


4e-21


83.6


219 


ig


Immunoglobulin domain


0.092


10.7 


222 


WD40


WD domain, G-beta repeat


7.4e-23


89.4


224 


TPR 




1.2e-08


42.1


225


DnaJ_CXXCXGX
G


DnaJ central domain (4 repeats)


1.5e-38


141.5


226 


DnaJ_CXXCXGX 
G 


DnaJ central domain (4 repeats) 


1.5e-38


141.5 


229 


HSP70 


Hsp70 protein 


2.4e-54


194.0


230 


GSHPx 


Glutathione peroxidases 


3.4e-47


170.2 


231 


tsp_l 


Thrombospondin type 1 domain 


0.0075 


17.1 


233 


cyclin


Cyclin


4.6e-144


492.0


234 


ras 


Ras family 


4.8e-50


179.7 


235 


LRR 


Leucine Rich Repeat 


1.2e-30 


115.3 


236 


LRR 


Leucine Rich Repeat 


6.7e-29 


109.4 


237 


PDZ 


PDZ domain (Also known as DHR 
or GLGF) . 


1.7e-09 


45.0 


244 


m 


Cytidine and deoxycytidylate
deaminase


2.5e-05


31.1


245 


ig


Immunoglobulin domain 


6.7e-08 


30.5 


248 


wnt


signaling protei


9.1e-270


742.6


250 


mito_carr


Mitochondrial carrier proteins


j 1.3e-5Ej 


193.6


254


adenylatekin
ase




1.8e-14


55.7


255 


I. JL -1 w ^ ^ JL U 

x 


Cation efflux family


2.8e-33


124.0


256 


SH3 


SH3 domain 


3.9e-14


60.4 


257 


l» JU C* AlO 


Transmembrane amino acid 


2.6e-52 


187.2 


258 


adenylatekin 
ase 


Adenylate kinase 


2.1e-110 


380.2


259 


HIT 


HIT family


8.2e-07


25.3 


260 


Bacterial_PQ 
Q 


PQQ enzyme repeat 


1.6e-15 


65.0 


262 




Proteasome A-type and B-type


6.5e-64


225.7 


267 


pkinase


Eukaryotic protein kinase
domain


6.3e-27


101.0


270 


filament


Intermediate filament proteins


3.2e-150


512.5


271 


Choline_kina


Choline/ethanolamine kinase 


2e-67 


237.4 


277 


Ribosomal_S7


Ribosomal protein S7p/S5e


3.3e-20


80.6 


279 




Eukaryotic protein kinase 
domain 


3.3e-77


269.9 


280 


WD40 


WD domain, G-beta repeat


7.8e-73


255.4


281 


WD40 


WD domain, G-beta repeat 


7.8e-73


255.4 


284 


zf-DHHC


DHHC zinc finger domain


4.6e-24


93.4


287 


Exonuclease


Exonuclease


1.4e-67


238.0 


291 


SAM


SAM domain (Sterile alpha
motif)


0.034


11.2 


292 


SAM 


SAM domain (Sterile alpha
motif)


0.034


11.2 


294 


zf-C2H2


Zinc finger, C2H2 type


1.4e-29


111.7 


295 


zf-C2H2 


Zinc finger, C2H2 type 


2.2e-125


430.0 


296 


mito_carr


Mitochondrial carrier proteins


4.1e-59


205.5 


297 


HMG_box 


HMG (high mobility group) box


6.7e-29 


109.4 


302


Glycos_trans
f_4


Glycosyl transferase


5e-87 


302. £ 


304 


tRNA-synt_2


tRNA synthetases class II (D, K
and N)


1.1e-84


294.8 


305 


KRAB 




2e-44


161.0


306


rrm


RNA recognition motif.


2.7e-44


160.6


308 


7tm_l 


7 transmembrane receptor
(rhodopsin family)


b . ^e- 


126.1


309


DNA_polymera
seX




2.4e-64


227.2


311 


F-box 


F-box domain.


9.5e-08


39.2 


312 


ig 


Immunoglobulin domain


~F — pT — TfH 

b . oe- iy 


65.9 


313 


Ets 


Ets-domain


8.1e-60


192.3 


315 


Kelch 


Kelch motif


1.3e-106


367.6 


317 


arf


ADP-ribosylation factor family


3.2e-35


130.4


318 


sugar_tr 


Sugar (and other) transporter


0.0003 


-73.1 


320 


pkinase 


Eukaryotic protein kinase
domain


8.1e-83


288.6


322 


pkinase 


Eukaryotic protein kinase
domain


4.9e-81


282.6


324 


Xlink 


Extracellular link domain


4.5e-143 


331.5 


326 


ARID 


ARID DNA binding domain


5.1e-37


136.4 


327 


HMG_box


HMG (high mobility group) box


6.7e-29


109.4


328 


cadherin 


Cadherin domain


8.1e-81


281.9 


331 


chromo 


'chromo' (CHRromatin
Organization Modifier)


4e-18 


66.7 


333 


Peptidase_M2
2


Glycoprotease family


1.2e-136


467.4 


335


vwa 


von Willebrand factor type A 
domain 


2.3e-07


37.9 


339 




Li —% cr» f~ am i 1 if 

i\o5 XTcintaxy 


7.8e-07


-59.1


340


zf-C2H2


Zinc finger, C2H2 type


8.2e-64


225.4


342


zf-C2H2


Zinc finger, C2H2 type


2.4e-85


297.0


343 




Immunoglobulin domain


0.0005


18.0


346




Eukaryotic protein kinase


6.5e-65


229.1 


347 


pkinase 


Eukaryotic protein kinase 


6.5e-65 


229.1 


351 


EGF 




8.5e-20


79.2


352 


ank 




2.5e-101


350.0


354 


TBC 


TBC domain


5 . le-lS 


63.3


355 


PHD 


PHD-finger


3.2e-07


37.4 


358 


DUF6 


Integral membrane protein DUF6 


0.033


15.8 


359 


zf-C2H2


Zinc finger, C2H2 type


7.4e-20


79.4


361 




Ank repeat 


6.6e-34


126.1


362


ArfGap


Putative GTP-ase activating
protein for Arf


4.7e-53


189.7


363 


efhand


EF hand


5.4e-10


46.6 


367 


LRR 


Leucine Rich Repeat 


8.8e-44


158.9


368


laminin_G


Laminin G domain


1.5e-33


121.7


369


PP2C


Protein phosphatase 2C


5.3e-20


73.9


372


LIM


LIM domain containing proteins


9.9e-15


57.1


373 


KRAB 


KRAB box


4.8e-23


90.0


376




Ion transport protein 


2.9e-09 


-4.2


377


Beach


Beige/BEACH domain


4.9e-208


704.5


380 


pkinase 


Eukaryotic protein kinase 
domain 


1.6e-94 


327.5 


381 


AMP-binding


AMP-binding enzyme 


1.4e-07 


-140.3 


382 


HECT 



HECT-domain (ubiquitin- 
transferase).


1.3e-07 


-13.5


384 




Ank repeat 


2.5e-101 


350.0 


386 




Immunoglobulin domain 


9.5e-06


23.6


388 




Zinc finger, C2H2 type 


1.7e-42 


154.6


389


ig


Immunoglobulin domain


2.8e-15


54.3 




mito_carr


Mitochondrial carrier proteins


3.5e-67


233.2


392 


TPR 


TPR Domain 


6.1e-17 


69.7


393


SH3


SH3 domain


3.5e-09


43.9


394


AAA 


ATPases associated with various
cellular act


4.1e-21


83.6


396 


spectrin


Spectrin repeat


2.1e-67


237.3


397


zf-C2H2


Zinc finger, C2H2 type


0.0066


23.1


399


fn3 


Fibronectin type III domain 


4.1e-102


352.6 


400 


WD40 


WD domain, G-beta repeat


0.00049


26.8


401


E1_dehydrog


Dehydrogenase E1 component


3e-119


409.6


402


fn3


Fibronectin type III domain 


0 


1719.6 


404 


LRR 


Leucine Rich Repeat


2.1e-10


48.0


405


cadherin


Cadherin domain


8.1e-81


281.9


406


zf-CXXC




5e-15 


63.4


410


RhoGEF


RhoGEF domain


1.1e-23


92.1


411 


F-box 




4.2e-06


33.7


412 


SNF2_N 


SNF2 and others N-terminal
domain


5.8e-16


61.6


415


CPSase_L_cha
in


Carbamoyl-phosphate synthase
(CPSase)


1.5e-172 


586.6 


418 


LRR 


Leucine Rich Repeat 


3.8e-24


93.6


419 


DENN 


DENN (AEX-3) domain


2e-58 


207.5 


420 


RasGEF 


RasGEF domain 


8.1e-43


155.7 


421 


ank 


Ank repeat 


1.4e-153 


523.7


424 


G-patch


G-patch domain


1e-19


78.9


425 


pkinase 


Eukaryotic protein kinase 
domain 


2.2e-31


117.1 


426 


Plexin_repea
t 


Plexin repeat 


0.0023 


24.6 


427 


Plexin_repea
t

Plexin repeat


0.0023


24.6 








429 


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger)


8.6e-11


39.2 


431 


DEAD 


DEAD/DEAH box helicase


1e-66


214.0


432 


SH3 


SH3 domain 


1 Ao t a 


67.2 


433 


GTP_CDC


Cell division protein 


2.1e-114 


393.5 


436 


Collagen 


Collagen triple helix repeat
(20 copies)


4.6e-194


658.1


438


Ricin_B_lect
in


Similarity to lectin domain of
ricin b


0.0085


10.5 


441 


Alpha_adapti
n_C


Alpha adaptin carboxyl-terminal
domai


1.2e-256


866.0


442


Alpha_adapti
n_C


Alpha adaptin carboxyl-terminal
domai


1.8e-235 


795.7 


443 


PDZ


PDZ domain (Also known as DHR
or GLGF).


1.9e-65


230.9


445


LON


domain


0.00012


-17.1


446 




Immunoglobulin domain 


n rt rt m n 


20.1 


451 


sushi 


Sushi domain (SCR repeat) 


1.4e-18


75.2 


452 


fn3


Fibronectin type III domain


1.5e-06


35.2 


454 


pyridoxal_de
C


Pyridoxal-dependent
decarboxylase


8.3e-14


50.3


456 


kinesin 


Kinesin motor domain 


4.9e-217


734.4 


457 


neur_chan 


Neurotransmitter-gated ion- 
channel 


1e-175


597.1 


458 


Josephin 




0.0002


18.7


468 


bZIP 


bZIP transcription factor 


1.7e-07


31.8 


470 


NTP_transfer
ase 


Nucleotidyl transferase 


6.3e-06 


-26.3 


471 


WD40 


WD domain, G-beta repeat 


2e-28 


107.9 


473 


LIM 


LIM domain containing proteins 


0.00021


20.7


477 
479 


zf-RanBP
WD40


Zn-finger in Ran binding

protein and others.

WD domain, G-beta repeat


0.028


21.0 


480 


KRAB


KRAB box


6.5e-18
1e-31


73.0
118.8


481 

485


ArfGap 
SH2 


Putative GTP-ase activating 
protein for Arf 

Src homology domain 2


8.4e-66
0.011


232.0
11.4


486 
487 

489 


C1q
dsrm

zf-C2H2


C1q domain

Double-stranded RNA binding
motif


4.3e-74
1.1e-47


259.6 
171.9 


490
492


Alpha_adapti
n_C

SKI


Zinc finger, C2H2 type
Alpha adaptin carboxyl-terminal
domai

Shikimate kinase


4.8e-153
3.4e-222


521.9
751.6


497 


ENV_polyprot
ein


ENV polyprotein (coat
polyprotein)


1.2e-10
2.6e-22


48.8

77.6 


498 

500 
501 


abhydrolase_ 
2 

rrm 
WW 


Phospholipase/Carboxylesterase

RNA recognition motif. 
WW domain 


0.041 

5.4e-34 
4 . oe — x a 


-48.1 

126.4 
73.4


502 
504 
505 


ig

abhydrolase 
vwa 


Immunoglobulin domain 

von Willebrand factor type A 
domain 


l.ie-io 

0.045
7.1e-62 


39.5 
-3 . S 
219.0 


508 
509 


Na_K_ATPase
C

Exonuclease


Na+/K+ ATPase C-terminus
Exonuclease


2.3e-145


496.3


510 


Glycos_trans
f_1


Glycosyl transferases group 1


1.3e-56 
2.9e-06 


201.5 
27.0 


511 


Glycos_trans
f_1


Glycosyl transferases group 1


2.9e-06 


27.0 


512 


Glycos_trans
f_1


Glycosyl transferases group 1


1.9e-09


38.5 


514


pro_isomeras


Cyclophilin type peptidyl-
prolyl cis-tr


1.8e-63


221.4 


515 


EGF 


EGF-like domain 


1.9e-18


74.7


516 


Surp 


Surp module 


4.3e-38


140.0


523


ig


Immunoglobulin domain 


3.3e-06 


25.0


526 


UBX 


UBX domain 


1.1e-34


128.6


528


adh_zinc


Zinc-binding dehydrogenases


2.7e-34


127.4


530 


SAM 


SAM domain (Sterile alpha 
motif) 


0.046 


10.0 


531 


adh_short 


short chain dehydrogenase 


0.0025 


-34.1


532


mito_carr


Mitochondrial carrier proteins 


2.5e-81


1 281 . *i 


533 


mito_carr


Mitochondrial carrier proteins


2e-61


1 213. S 


534 


thiolase 


Thiolase 


O . DC" iDJ 


622.0


535 


FMO-like 


Flavin-binding monooxygenase- 
like 


0 


1153.7 


536


SCAN 


SCAN domain 




196.6


537


tRNA-synt_1


M and V) 


3.1e-136


466.0


538


tRNA-synt_1


M and V) 


3.1e-136


466.0


539


tRNA-synt_1


tRNA synthetases class I (I, L,
M and V)


1.9e-117


403.6


540 


tRNA-synt_1


M and V)


3.1e-136


466.0


541 


vATP-synt_E 


ATP synthase (E/31 kDa) subunit 


5.9e-85


295.7


543


zf-C2H2


Zinc finger, C2H2 type


5 . 5e-c9 


242.6


544


DUF101 


Protein of unknown function 


8.5e-38 


139.0 


545 


TGFb_propept
ide


TGF-beta propeptide


1.1e-67


238.2 


547 


WD40 


WD domain, G-beta repeat


2 . Se-32 


120.8


548 


RHD 


Rel homology domain (RHD).


1.6e-238


686.2


549


MMR_HSR1


GTPase of unknown function 


5.4e-67 


236.0 


551 


HECT 


HECT-domain (ubiquitin-
transferase).


4.3e-127


435.6


554


MHC_II_alpha


Class II histocompatibility
antigen, alp


3.5e-74


259.8


555


zf-UBR1


Putative zinc finger in N- 
recognin 


3.3e-16 


67.3 


556 


Kelch 


Kelch motif 


5.5e-2S 


109.7 


561 


AMP-binding




2.8e-06


-163.7


562 


PABP 


unique domai 


4.9e-38


139.8


564


Gag_p30


Gag P30 core shell protein 


x • ze — d / i 


238.2 


566 


PWWP 


PWWP domain 


ft i o_ i a 


66.0


567


SCAN


SCAN domain


l . jc-oo 1 


238.9


569 


pkinase 


Eukaryotic protein kinase 
domain 


1.5e-84




570 


pkinase 


Eukaryotic protein kinase 
domain 


1.5e-84




571 


CN_hydrolase 


Carbon-nitrogen hydrolase


0.00081


-79.7 


572 


myosin_head 


Myosin head (motor domain) 


0


1495.2


573 


myosin_head 


Myosin head (motor domain) 


0


1490.4


575 


Surp 


Surp module 


1.7e-23 


91 . S 


576 


Surp 


Surp module 


1.7e-23


91 ^ 
?r x . 3 


577 


DNA_pol_B


DNA polymerase family B


0


11jO,D 


578


PDZ 


PDZ domain (Also known as DHR 
or GLGF) . 


8.3e-09




579 


LRR 


Leucine Rich Repeat 


4.9e-21


83.3 


580 


neur_chan 


Neurotransmitter-gated ion- 
channel 


5.9e-177


601.3 


583 


sushi 


Sushi domain (SCR repeat) 


0


1673.0 


584 


DEAD 


DEAD/DEAH box helicase 


7.3e-36


116.3 


586 


KH-domain


KH domain


2.9e-13


57.5


587 


G-patch 


G-patch domain


2.3e-14


61.2 


589 


LIM 


LIM domain containing proteins 


2.3e-36


133.4 


590 


bromodomain 


Bromodomain 


6.6e-32


114.7


591 


bromodomain 


Bromodomain 


5.6e-32


114.7 


592 


hormone_rec 


Ligand-binding domain of
nuclear hormone 


3 .se-22 


87.1 


593


PHD


PHD-finger


3.8e-12 


53.8 


594 


cadherin 


Cadherin domain 


4.2e-99 


342.7 


596 


pkinase 


Eukaryotic protein kinase 
domain 


5e-92 


319.2 


597 


WD40


WD domain, G-beta repeat


0.00054


26.7


600 


FG-GAP 


FG-GAP repeat 


4.3e-75 


262.9 


602 


G_Adapt_CT 


Gamma-adaptin, C-terminus


1.1e-53


191.8 


603 


pkinase 


Eukaryotic protein kinase 
domain 


2.3e-86 


300.4 


605 


Collagen 


Collagen triple helix repeat 
(20 copies) 


8e-42 


152.4 


606 


mito_carr 


Mitochondrial carrier proteins 


6.3e-67 


232.3


608 


PWWP 


PWWP domain 


2.6e-28 


107.5 


609 


PWWP 


PWWP domain 


2.6e-28 


107.5 


613 


CAP_GLY 


CAP-Gly domain 


0.0046 


20.1 


615 


RFX_DNA_bind 
ing 


RFX DNA-binding domain


5.2e-54


192.9 


616 


kinesin 


Kinesin motor domain 


1.1e-81


284.8 


617 


kinesin 


Kinesin motor domain 


8.4e-80


278.5 


618 


zf-C3HC4


Zinc finger, C3HC4 type (RING 
finger) 


0.0098 


13.1 


620 


MATH 


MATH domain 


7.8e-05 


22.2


621 


Y_phosphatas
e


Protein-tyrosine phosphatase


1.4e-32 


121.6 


622 


pkinase 


Eukaryotic protein kinase 
domain 


4.4e-40


146.6 


623 


BNR 


BNR repeat 


2.1e-11


51.3 


624 


molybdopteri
n 


Prokaryotic molybdopterin 
oxidoreductas 


1.4e-12 


42.2 


625 


TPR 


TPR Domain 


1.1e-17


72.2 


627 


cNMP_binding 


Cyclic nucleotide-binding
domain


3.7e-58


206.6 


630 


adh_short


short chain dehydrogenase 


5e-17 


70.0 


631 


zf-C2H2 


Zinc finger, C2H2 type 


2.1e-88


307.1 


632 


rrm 


RNA recognition motif.


4e-05


30.5 


635 


pkinase 


Eukaryotic protein kinase 
domain 


1.6e-104 


360.7 


636 


Fork_head 


Fork head domain 


5.9e-27


103.0


637 


pkinase 


Eukaryotic protein kinase 
domain 


3.8e-70


246.5 


642 


TPR 


TPR Domain 


4.8e-08 


40.1 


643 


efhand 


EF hand 


1.9e-27 


104.6


647 


SNF2_N 


SNF2 and others N-terminal
domain


1.2e-101


351.1 


648 


PseudoU_synt 
h_2


RNA pseudouridylate synthase 


1.9e-55 


197.6 


ODO 




Zinc finger, C2H2 type 


0.0087 


22.7 


651 


ank 


Ank repeat 


1.3e-17 


71.9 






I/LWEQ domain 


9.5e-101 


341.0 


653 


neur_chan


Neurotransmitter-gated ion-
channel


4.1e-171


581.8 


a c a 


tsp_1


Thrombospondin type 1 domain


4.1e-47


169.9 


c q 


FH2 


Formin Homology 2 Domain 


1e-107


371.2 


box 


pou 


POU domain - N-terminal to
homeobox domain 


5.3e-45 


162.9 


662 


C2 


C2 domain 


6.7e-19 


76.2 


663 


C2 


C2 domain 


6.7e-19 


74.2 


664 


C2 


C2 domain 


6.7e-19 


76.2


667 


GST 


Glutathione S-transferases.


9.3e-34


114.4


668 


LRR 


Leucine Rich Repeat 


9.3e-31 


1X5,6 


670 


spectrin 


Spectrin repeat 


4e-57


203.2


671 


I_LWEQ 


I/LWEQ domain


9.5e-101 


341.0 


672


ABC_tran


ABC transporter 


5.3e-60 


212.8 


674 


WD40


WD domain, G-beta repeat


4.8e-24 


93.3


675 


WD40 




4.8e-24


93.3


676 


LRR 


Leucine Rich Repeat 




25.2


679 


zf-CCCH 


Zinc finger C-x8-C-x5-C-x3-H
type


2.6e-29


107.7 


680


zf-C2H2 


Zinc finger, C2H2 type 


5.2e-05


1 ^ n n 


681


CH


Calponin homology (CH) domain


2.4e-17


71.1 


682 


DSPc


Dual specificity phosphatase, 
catalytic doma 


t . Je — H J 


1 Ibb.b 
— j 


683


zf-C3HC4


Zinc finger, C3HC4 type (RING 
finger) 




10.8


687 


Synapsin 


Synapsin 


0


1890.8


689 


PR55 


Protein phosphatase 2A 
regulatory subunit PR 


0


1 103 8 . 8 


691 


homeobox 


Homeobox domain


8.5e-30


[ 1 1 "*> A 


696 


Peptidase_M2
4 


metallopeptidase family M24 


2.6e-59 


~t 9i n c 


697 


RhoGEF 


RhoGEF domain 


9.5e-35


128.9


698 


PHD 


PHD-finger


0.008 


9.3 


701 


zf-C2H2 


Zinc finger, C2H2 type 


5.5e-123


422.0


702 


Sulfatase


Sulfatase 


1 e> — Til 


781.6


703 


zf-C2H2 


Zinc finger, C2H2 type


5.7e-20


79.8


707 


Acyl_transf 


Acyl transferase domain 




1 8 8.8 


708 


WD40


WD domain, G-beta repeat


4.8e-19


76.7


710 


Ran_BP1


RanBP1 domain.


8.4e-06


-7.3


713 


DEAD 


DEAD/DEAH box helicase


9.9e-42


134.9


714 


PH 


PH domain


1.6e-09


39.0


715 


DSPc 


Dual specificity phosphatase,
catalytic doma


1.5e-37


138.2


717 


Sialyltransf


Sialyltransferase family


7.5e-31


r us . 9 


718 


ig


Immunoglobulin domain


1e-29


100.8


719 


integrin_B


Integrins, beta chain


0


1125.4


720 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger)


1.1e-08


32.4


722 


Peptidase_C2


Calpain family cysteine
protease


3e-145


495.9 


723 






2.2e-05


22.4 


724 


F-box 


F-box domain. 


0.007


23.0 


725 


Nop 


Putative snoRNA binding domain


8.1e-58


205.5 


726 


Nop 


Putative snoRNA binding domain


8.1e-58


205.5 


727 


WD40 


WD domain, G-beta repeat


7.5e-26


99.3


730 


dsrm 


Double-stranded RNA binding
motif


0.027


12.1


731


dynamin 


Dynamin family 


A *> a — 1 c 1 


66.9


733 


zf-CCCH


Zinc finger C-x8-C-x5-C-x3-H 
type 


2.8e-10 


41.7 


735 


CDP- 

OH_P_transf 


CDP-alcohol

phosphatidyltransferase


4.2e-26


100.1


738


DEAD 


DEAD/DEAH box helicase 


8.6e-57


182.5 


739 


TSC22 


TSC-22/dip/bun family 


6.5e-32


119. £ 


742 


ras


Ras family 


2.2e-100




743 


PMI_typeI


Phosphomannose isomerase type I


1.2e-243


822.9 


747 


trypsin 


Trypsin 


6.4e-88


279.4


748


kazal 


inhibitor domain 






749 


efhand


EF hand


6.3e-06


J J . 1 


751 


PHD 


PHD-finger


4.9e-16


66.7


752 


zf-C2H2


Zinc finger, C2H2 type


3.2e-21


83.9


753 


Hydrolase 


haloacid dehalogenase-like 
hydrolase 


6.1e-11


49.8 


754


Ribosomal_L3
9


Ribosomal L39 protein 


0.00018 


26.7


755 


PH


PH domain


3.6e-14


55.7 


758 


SCAN 


SCAN domain


1.4e-53


191.5


759 


PA 


PA domain


0.0065


23.1 


760


arf


ADP-ribosylation factor family


2.2e-19


77.8


761


CIDE-N


CIDE-N domain


2.2e-40


147.6


IK? 


histone


Core histone H2A/H2B/H3/H4 


9.9e-53 


188.6 


/ O .> 


zf-MYND


MYND finger


4.1e-14


60.3 


/oft 


pou 


Pou domain - N-terminal to
homeobox domain


1e-52


188.6 


767 


vwc 


von Willebrand factor type C 


2.9e-34


127.3


769 


efhand


EF hand


4.8e-11


50.1 


770 


zf-C4


Zinc finger, C4 type (two


2.4e-53


181.6


772 


ras 




7e-90


312.0


773 


Sulfatase


Sulfatase


1e-142


487.5


775 


zf-C2H2


Zinc finger, C2H2 type


1.1e-12


55.5 


776 


zf-C2H2


Zinc finger, C2H2 type


1.1e-12


55.5 


777 




Zinc finger, C2H2 type 


1.1e-12


55.5 


778 




RNA recognition motif. 


2.1e-32


121.1 


i 


/"< c r>r\ 


Glucose-6-phosphate
dehydrogenase 


1.5e-76 


236.6 


/ o u 


spectrin 


Spectrin repeat 


3.7e-29


110.3 


TOT 

/ox 


mito_carr


Mitochondrial carrier proteins


4.6e-57


198.5 


782 




SCAN domain 


1.3e-24 


95.2 


783 




PDZ domain (Also known as DHR 
or GLGF).


4.1e-07


37.1 


785 


DEAD 


DEAD/DEAH box helicase 


6e-06 


21.7


786




Ras family 


5.3e-39


143.0 


" 797 " 


RNase_HII


Ribonuclease HII


2.5e-67 


237.1 


790 


PI3_PI4_kina
se


Phosphatidylinositol 3- and 4-
kinases


5.4e-108 


372.2 


795 


cadherin 


Cadherin domain 


2.5e-40


147.4 


796 


ARID 


ARID DNA binding domain 


1.6e-20 


81.6 


TOT 


trypsin 


Trypsin 


9.9e-20 


64.8


TOO ' 


CH 


Calponin homology (CH) domain 


3.7e-15


63.8


oux 


Gal- 


Vertebrate galactoside-binding 
lectin 


4.1e-25 


88.7


803 


WD40 


WD domain, G-beta repeat 


0.00082


26.1


806 




TBC domain 


1 .8e-26 


101.4 


807 


TBC 


TBC domain 


1.8e-26 


101.4 


808 


CN_hydrolase 


Carbon-nitrogen hydrolase


8.8e-80 


"278.* - 


Oil 


i-BbU . NVY3 HM 
P 


Histone-like transcription
factor 


£e-14 


59.8


812 


adh_short


short chain dehydrogenase


8.1e-20


79.3


814 




Domain of unknown function 


3.3e-71 


250.0 


815 


zf-C2H2


Zinc finger, C2H2 type 


8.2e-66 


232.1 


816 


Pept_tRNA_hy


Peptidyl-tRNA hydrolase


1.6e-37


138.0


817 


ARID 


ARID DNA binding domain 


2.5e-18 


74.3


826 


xra ej.j?*i ear 
2 


eIF4-gamma/eIF5/eIF2-epsilon


1.6e-32 


121.5


830




Putative GTP-ase activating 
protein for Arf 


1.5e-53 


191.3


831 


LRR 


Leucine Rich Repeat 


2.1e-26 


101.1


832 




Laminin EGF-like (Domains III
and V)


2e-57


204.2


839 




RNA recognition motif.


1.3e-22 


88.5 


840 


Y_phosphatas 


Protein-tyrosine phosphatase


2.6e-119


409.8


841 




Eukaryotic protein kinase 
domain 


3.4e-100


346.3 


844 


Ribosomal_L2
2e


Ribosomal L22e protein family




228.4 


846 


IBR


IBR domain


9e-15


62.5 


849 


zf-C3HC4


Zinc finger, C3HC4 type (RING 
finger) 


7.4e-07 


26.5 


850 


zf-C3HC4


Zinc finger, C3HC4 type (RING 
finger) 


0.00016 


18.9 


851 


SET 


SET domain 


5e-30 


113.2 


852


SRCR


Scavenger receptor cysteine-
rich domain


0 


1025.4


853 


SRCR 


Scavenger receptor cysteine-
rich domain


0


1025.4


857 


lactamase_B


Metallo-beta-lactamase
superfamily




-6.0 


858


COX6A


Cytochrome c oxidase subunit
VIa


3.4e-58 


206.7 


859 


rrm 


RNA recognition motif. 


5.4e-45 


162.9 


861 


PRK 


Phosphoribulokinase




219.4


863 


mito_carr


Mitochondrial carrier proteins 


2.9e-53 


185.5 


864 


HSP90 




4.7e-158


538.5


866 




Immunoglobulin domain 


4e-12 


44.1 


867 


zf-C2H2 


Zinc finger, C2H2 type 


7e-135 


461.5 


872 


histone


Core histone H2A/H2B/H3/H4


4.9e-41


149.8 


874 


>-raabc Xi CX1S 

in 


Carbamoyl-phosphate synthase


2.1e-218


739.0


879 


Ribosomal_S1
2e


Ribosomal protein S12e


2.1e-98


340.3 


882 


serpin 


Serpins (serine protease 
inhibitors)


2.5e-42 


145.7 


883 


Patatin


Patatin


1.2e-51


182.0


884 


RA 


Ras association (RalGDS/AF-6)
domain


0.044


8.0 


887 


DUF92 


Integral membrane protein DUF92 


2.7e-12


54.3


889 




Sugar (and other) transporter


8.2e-63


222.1


893 


DUF28 


Domain of unknown function 
DUF28 


1.3e-43


158.3 


896 


IP_trans


Phosphatidylinositol transfer
protein


6.5e-98


338.7


898 


DEAD 


DEAD/DEAH box helicase


1.5e-48


156.5


899 


KE2




7e-61


215.7


900 


KE2 


KE2 family protein 


4.3e-51 


183.2 


901 


zf-C2H2


Zinc finger, C2H2 type


2.7e-57


203.8


902 


ras 


Ras family


2.3e-75


263.8


904 


TPR 


TPR Domain


3.2e-22


87.2


906 


GBP 


Guanylate-binding protein


8.9e-253


853.1


907 


GBP 


Guanylate-binding protein 


1.1e-239


809.6 


908 


WD40 


WD domain, G-beta repeat


2.5e-26


100.8


909 


PH 


PH domain


1.3e-09


39.4


910 


zf-C2H2


Zinc finger, C2H2 type


2.5e-39


144.1


913 


Epimerase


NAD dependent
epimerase/dehydratase family


5e-07 


-88.5 


921 


TBC 


TBC domain


1.5e-09


30.7 


922 


WD40 


WD domain, G-beta repeat 


1.6e-25


98.2 


923 


WD40


WD domain, G-beta repeat


8.2e-07


36.1 


924 


Hydrolase 


haloacid dehalogenase-like 
hydrola


2.9e-05 


29.1 


925 


UQ_con


Ubiquitin-conjugating enzyme


0.00033


-27.6


92* 


CH 


Calponin homology (CH) domain


3.3e-53


190.2 


928 


WD40 


WD domain, G-beta repeat


5.9e-48


172.7


929 


zf-C3HC4 


Zinc finger, C3HC4 type (RING
finger)


3.1e-10


37.4 


930


Ribul_P_3_ep
im


Ribulose-phosphate 3 epimerase
family 


7.2e-105 


36*1.8 


931 


Ribul_P_3_ep
im


Ribulose-phosphate 3 epimerase
family


1.2e-96


334.4 


936 


C2 


C2 domain 


2.2e-62


220.7


937 


NAP_family 


Nucleosome assembly protein
(NAP)


1.1e-22


U.t 


940 


abhydrolase 


alpha/beta hydrolase fold 


0.011 


3.1 


944 


Tropomyosin 


Tropomyosins 


3.2e-07


25.1


948 


pkinase 


Eukaryotic protein kinase 
domain 


3.4e-75


263.2 


949


WD40


WD domain, G-beta repeat


1.8e-27


104.7 


950


Acyltransfer
ase


Acyltransferase


1.6e-07


38.4


951 


SAM 


SAM domain (Sterile alpha
motif) 


0.014


14.5


954 


GFO_IDH_MocA


Oxidoreductase family 


X*. X i 


CO n 


955 


BTB 


BTB/POZ domain 


7e-22 


86.1 


956 


BTB 


BTB/POZ domain


7e-22


86.1


957 


CDP- 

OH_P_transf


CDP-alcohol

phosphatidyltransferase


n nc7 
u . U5j 


-22.2


959 


ras 


Ras family 


2.4e-97 


336.8 


960 


ras 


Ras family


8.4e-43


155.6


961 


Acetyltransf


Acetyltransferase (GNAT) family


1.2e-08 


42.2 


962 


adh_short


short chain dehydrogenase


2.4e-31


117.6


963 


mutT 


Bacterial mutT protein


5.6e-06


26.2


969 


IF-2B 


Initiation factor 2 subunit 


8.4e-193 


653.9


970 


RNase_PH


3' exoribonuclease family


9e-24


92.4


975 


WW 


WW domain 


5.7e-25 


96.4 


977 


PDZ 


PDZ domain (Also known as DHR 


3.6e-21


83.7 


978 


7 


niuubuiudo. protein ill / 


2.4e-20


81.0


979 


LIM


LIM domain containing proteins


5.8e-42


152.8


980 


Calsequestri
n


Calsequestrin


1.7e-297


1001.7


982 


HSP20 


Hsp20/alpha crystallin family


1.2e-10


43.2


983 




NADH ubiquinone oxidoreductase,
20 Kd sub


4.8e-63


222.9


988 


TBC 


TBC domain


2.2e-50


180.8 


989 


TBC 


TBC domain 


2.2e-50 


180.8


993 


o 


tRNA intron endonuclease


0.0017


-34.2 


994 


homeobox


Homeobox domain 


4e-18 


73.6


997 


pyr_redox


Pyridine nucleotide-disulphide
oxidoreducta


0.012


11.6 


1000 


mito_carr


Mitochondrial carrier proteins


9.7e-123


421.2 


1001 


RA 


Ras association (RalGDS/AF-6)
domain


1.2e-15


65.4 


1004 


DUF81


Domain of unknown function
DUF81


0.099


10.2 


1005


actin


Actin


1.3e-174


574.3


1006 


actin 


Actin 


3.1e-130


428.6


1007 


cpn60_TCP1


TCP-1/cpn60 chaperonin family


3.7e-195


661.8


1008 


TPR 


TPR Domain 


8.1e-44


159.0


1009 


zf-C2H2 


Zinc finger, C2H2 type


3.6e-61


216.6


1011 


zf-C2H2 


Zinc finger, C2H2 type 


3.6e-61 


216.6 


1012 


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger)


4.7e-15


53.1 


1016 


tRNA-synt_2c


tRNA synthetases class II




55.2


1018 


RhoGAP


RhoGAP domain


1.6e-78


274.3


1022 


PGAM 


Phosphoglycerate mutase family


3.8e-18


69.7 


1026 


HMG_box


HMG (high mobility group) box


8.4e-20


79.2


1027 


TBC 


TBC domain 


7.3e-45 


162.5 


1028 


UQ_con




1.4e-49


178.1


1032 


PDZ 


PDZ domain (Also known as DHR
or GLGF).


0.028


16.3


1034 


Hydrolase 


hydrolase 


2e-21 


84.6


1037 


KRAB 


KRAB box 


4.8e-05 


32.4 


1038 


Cation_efflu
x


Cation efflux family


7.1e-42


152.5 


1040 


ART 


NAD:arginine ADP- 
ribosyltransferase


4.7e-47


169.1 


1042 


WD40 


WD domain, G-beta repeat 


1.9e-18 


74.7


1043 


zf-C2H2


Zinc finger, C2H2 type


3.7e-24


93.7


1045 


lectin_c


Lectin C-type domain


1.9e-28


108.0 


1046


Glucosamine_
iso


Glucosamine-6-phosphate
isomerase


0.00013


-25.1


1047 


ligase-CoA 


CoA-ligases


4.5e-80


279.4


1049 




Immunoglobulin domain 


1.7e-09


35.6 


1050 


Ribosomal_L2
4e


Ribosomal protein L24e


2e-33


124.5


1054 


Amidase 


Amidase 




518.7


1055 


rrm


RNA recognition motif.


3.8e-26


100.3


1058


annexin 


Annexin 


6.9e-44


ICQ -) 


1059 


PMP22_Claudi 
n 


PMP-22/EMP/MP20/Claudin family 


0.023


T^t — Z 


1060 


homeobox


Homeobox domain


3.2e-31


±! / - si 


1062 


Acyltransfer 
ase 


Acyltransferase


0 OOOfi 1 ^ 

v * y*j vj \j d z> 


T7\ — E~ 


1064 


AMP-binding 


AMP-binding enzyme 


W * DC -L w KJ 


*3 d ^ i 

J Tt J . J 


1065 


LRR 


Leucine Rich Repeat 


3.3e-14


DU . O 


1066 


GTP1_OBG


GTPl/OBG family 


4.8e-41 


141.8 


1071 






8.4e-48


159.1 


1072 


PHD 


PHD-finger


6.8e-07


36.3


1074 


DENN 


DENN (AEX-3) domain 


8.3e-33 


121.5 


1075 


SCP 




4.7e-41


149.8


1077 


OLF 


Olfactomedin-like domain


2.2e-66


234.0 


1078 


mito_carr


Mitochondrial carrier proteins


1e-42


149.3


1079 


WD40


WD domain, G-beta repeat


6.2e-45


162.7


1007 


START 


START domain


1.5e-48


174.7


1093 


DSPc


Dual specificity phosphatase,
catalytic doma


3.3e-63


223.4


1094 


GSHPx 


Glutathione peroxidases 


9.6e-41


148.8 


1095 


DUF25 


Domain of unknown function
DUF25 


2e-75 


264.0 


1096 


DUF25


Domain of unknown function 
DUF25 


6e-75 


262.4 


1105 


Nitroreducta
se


Nitroreductase family


1.3e-13


58.6 


1106 


PTE 


Phosphotriesterase family


1.3e-179


610.1 


1107 


DAGKc


Diacylglycerol kinase catalytic
domain


0.00049


19.6 


1109 


ras 


Ras family


1.3e-15


40.7


1115 


ArfGap 


Putative GTP-ase activating
protein for Arf


9.7e-47 


168.7 


1116 


HMG14_17




4.4e-21


83.5


1117 


HMG14_17


HMG14 and HMG17


9.9e-12


52.4


111S 


FAA_hydrolas
e


Fumarylacetoacetate (FAA)
hydrolase fam


2e-63


290.6


1120 


pkinase 


Eukaryotic protein kinase 
domain 


1.4e-94 


327.6 


1123 


abhydrolase


alpha/beta hydrolase fold


9.2e-23


89.0


1129 


pro_isomeras 
e 


prolyl cis-tr 


2.2e-56


197.1


1131 


DnaJ 


DnaJ domain 


1.6e-30 


114.9


1132 


WD40 




1.3e-19


78.6


1133 


WD40 


WD domain, G-beta repeat 


1.8e-15 


64.9


1134 


PH 




0.0015


17.8


1136 


Adap_comp_su
b


Adaptor complexes medium
subunit family


1.2e-256


866.0


1137 


Adap_comp_su
b 


Adaptor complexes medium 
subunit family 


2.5e-209 


708.8 


1139 


ras 


Ras family


1.5e-86


301.0


1141 


pkinase 


Eukaryotic protein kinase 
domain 


9.4e-74 


258.4


1152 


Acyltransfer 
ase 


Acyltransferase


1.2e-05


29.9


1153 


IRS 


PTB domain (IRS-1 type)


5.4e-55


196.1 


1155 


ig


Immunoglobulin domain 


1.3e-31 


106.9 


1157


Asparaginase
_2


Asparaginase


5.4e-72


252.3


1159


GMC_oxred


GMC oxidoreductase


2.7e-142


485.3


1160


zf-AN1


AN1-like Zinc finger


0.00021


27.9


1163 


linker_histo
ne


linker histone H1 and H5 family


3.8e-14


60.4 


1164 


DED 


Death effector domain 






1165 


IRS 


PTB domain (IRS-1 type)


2.6e-43 


157.3 


1166 


IRS 


PTB domain (IRS-1 type)


2.6e-43


157.3 


1168 


SAM 


SAM domain (Sterile alpha
motif)


i n ha 


10.5 


1170 


abhydrolase 


alpha/beta hydrolase fold 


1 n nod 

1 W.W5B 


_ 

-7.5 


1174 


SAP 


SAP domain 


1 *5 Qc _ i n 

1 J . 36 - JL V 


47.1


1177 


PP2C 


Protein phosphatase 2C 


5.3e-31 


112.5


1178 


WD40 




4.7e-35


129.9


1180 


Ets 


Ets-domain


1.8e-09


33.3


1181 


Collagen


Collagen triple helix repeat
(20 copies)


0.00016


24.7


1182 


TCL1_MTCP1


TCL1/MTCP1 family


9.5e-56


198.6 


1184 


RasGEF 


RasGEF domain


1.7e-88


307.4


1185 


mito_carr


Mitochondrial carrier proteins


1.5e-62


217.3


1187 


UPAR 


u-PAR/Ly-6 domain


0.0042


15.6 


1188 


Orn_DAP_Arg_
deC


Pyridoxal-dependent
decarboxylase


6.2e-128


430.6


1193 


Stathmin


Stathmin family


1.8e-90


314.0


1194 


Stathmin 


Stathmin family 


1.8e-90 


314.0


1195 


Sec1


Sec1 family


3.2e-183


622.1


1196 


pyr_redox


Pyridine nucleotide-disulphide
oxidoreducta


3.1e-32 


111.8 


1197 


Glyco_transf
_8


Glycosyl transferase family 8


1.2e-09


45.5


1202 


K_tetra


K+ channel tetramerisation
domain


0.022


-16.8


1203 




short chain dehydrogenase 


8.3e-45


162.3


1206 


Ubie_methylt
ran


ubiE/COQ5 methyltransferase
family 


1.3e-121 


417.4 


1208 


7tm_3


7 transmembrane receptor 


7.2e-09 


29.0


1209 


ank 


Ank repeat


3.9e-15


63.7


1210 


vATP-
synt_AC39


ATP synthase (C/AC39) subunit


2.5e-128


439.7


1212 


zf-C2H2 


Zinc finger, C2H2 type


5.5e-17


69.3


1213


efhand


EF hand 


3.2e-07 


37.4 


1219 


rrm 




2.1e-40


147.7


1220 


DUF6 


Integral membrane protein DUF6


0.015


21.5


1222 


SCAN 


SCAN domain


1.5e-71


251.1


1223 


G-gamma


GGL domain


3.5e-36


129.5


1227 


catalase 


Catalase


0 


1158.9 


1232 


PX 




2.2e-15


64.5


1233 


PX 


PX domain


2.2e-15


64.5


1236


FCH 


Fes/CIP4 homology domain


3.3e-09


44.0


1241 


Peptidase_M2
0


Peptidase family M20/M25/M40


2e-63 


224.1


1243 


WW 




0.044 


17.9


1247 


UPF0006 


Metalloenzyme of unknown 
function UPF0006


6.3e-61 


215.8 


1248 


Glycos_trans
f_2


Glycosyl transferases


4.5e-10


46.9 


1249 


efhand 


EF hand


4e-11


50.4 


1254 


UQ_con


Ubiquitin-conjugating enzyme


2.1e-73


257.3 


1255 




Ras family


2.2e-62


220.7 


1256 


formyl_trans
f


Formyl transferase


4.9e-30


108.3 


1259 


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger) 


5.3e-13 


46.4 


1261


DiHfolate_re
d 


Dihydrofolate reductase 


2.1e-69 


241.7 


1262 


G_glu_transp
ept


Gamma-glutamyltranspeptidase


1.8e-110 


380.4 


1263 


PAS


PAS domain


1.3e-08 


36.9 


1265


LRR


Leucine Rich Repeat


J.2e-22


86.9


1266 


SCP 


SCP-like extracellular protein 


6e-29 


108.0 




K_tetra 


K+ channel tetramerisation
domain 


2.8e-27 


104.0 


1269 


ras 


Ras family 


1.3e-85


297.9


1275 


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger)


4.2e-10


37.0 


1276


abhydrolase


alpha/beta hydrolase fold


5.4e-23


89.8 


1277 


abhydrolase 


alpha/beta hydrolase fold 


5.6e-21 


83.1 




trypsin 


Trypsin 


4.4e-41


132.0


■toon 


PBP 


Phosphatidylethanolamine- 
binding protein 


1.3e-13 


58.7 


1285 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


5.6e-14 


49.6 


1287 


ank 


Ank repeat 


1.7e-52


187.8


1294 


fn3


Fibronectin type III domain


0.026


20.9


1295 


GBP 


Guanylate-binding protein


0.00026


-70.0


1296 


PMP22_Claudi 
n 


PMP-22/EMP/MP20/Claudin family 


6.9e-41 


149.3 


1297 


Rhodanese 


Rhodanese-like domain


3.2e-14


60.7 


1298 


LIM 


LIM domain containing proteins 


5.8e-21


79.1 


1301


rnaseA


Pancreatic ribonucleases 


4.9e-43 


145.2 


1307 


mito_carr


Mitochondrial carrier proteins 


2.1e-53 


186.0 


1308 


WD40 


WD domain, G-beta repeat 


1.6e-17 


71.6 


1310 


UPAR_LY6


u-PAR/Ly-6 domain


7.1e-20


75.5 


1313 


thiored 


Thioredoxin 


3.6e-05


21.6 


1314 


Aa_trans 


Transmembrane amino acid 
transporter protein 


1.5e-67 


237.9 


1316 


trypsin 


Trypsin 


4.4e-41 


132.0 


1320 


Ribosomal_L1
3 


Ribosomal protein L13 


3.9e-62 


219.8 


1327 


Armadillo_se
g


Armadillo/beta-catenin-like 
repeats 


0.0054 


23.4 


1328 


KRAB


KRAB box 


0.052 


-5.6 


1329 




RNA recognition motif. 


2.1e-40 


147.7 


1330 


Bcl-2 


Apoptosis regulator proteins, 
Bcl-2 family


0.014 


-1.6 


1331 


PX 


PX domain 


2.1e-10 


48.0 


1333 


KRAB 


KRAB box


1.8e-36


134.6


1334 


UPP_syntheta 
se


Putative undecaprenyl 
diphosphate synt 


2.3e-89 


310.3 


1335 


UPP_syntheta 
se 


Putative undecaprenyl 
diphosphate synt 


1.8e-59 


211.0 


1336 


DSPc


Dual specificity phosphatase,
catalytic doma


1.2e-31


118.6 


1337 


DSPc


Dual specificity phosphatase,
catalytic doma


2.3e-12


54.5


1338 


TPR 




0.00021


28.1 


1340 


metalthio


Metallothionein


0.013


20.3


1341 


mutT 


Bacterial mutT protein


5.8e-09


36.5 


1343


Band_41


FERM domain (Band 4.1 family)


1.3e-38


122.5


1344 


Kelch 


Kelch motif 


1.4e-44


161.5 


1345 


Antifreeze 


Antifreeze protein 


1.2e-10


48.8 


1347 


3Beta_HSD 


3-beta hydroxysteroid
dehydrogenase/isomera


0.086


-177.2 


1348 


BTB 


BTB/POZ domain


5.3e-28


106.5


1349 


DUF6 


Integral membrane protein DUF6 


0.033


15.8


1350 


myosin_head 


Myosin head (motor domain) 


0


2083 


1352 


Nramp 


Natural resistance-associated
macrophage pro 


1.2e-202 


686.6 


1353 


S_100 


S-100/ICaBP type calcium 
binding domain 


5.3e-23 


89.9 


1355 


DEAD 


DEAD/DEAH box helicase 


3.6e-65 


209.0 


1356 


C2 


C2 domain 


2.4e-15


64.4


1357 


RBD 


Raf-like Ras-binding domain


4.2e-57


203.1


1360 


zf-C2H2


Zinc finger, C2H2 type 


7.4e-141 


481.4 


1361 


HMG14_17


HMG14 and HMG17 


7.9e-40 


145.7


1362 


SIS 


SIS domain


3.8e-30


113 . e 


1363 


SIS 




1.3e-28


108.5


1364 




Til ifl 11 1T1 <"»<"tl f"»V"»l 1 1 1 n r^nina v\ 


0.00026


19.0


1368 


K_tetra


K+ channel tetramerisation
domain


1.1e-16


68.9


1371


Collagen


Collagen triple helix repeat
(20 copies)


2.2e-113


390.1


1372 


DnaJ


DnaJ domain


o . be- jo 


132.7


1376 


KRAB 


KRAB box 


5 — -y^ in 


141.0


1378 


ELM2 


ELM2 domain




91.3


1386 


thiored 


Thioredoxin


1.2e-23


82.8


1381 


ank 


Ank repeat


2.3e-83


290.4


1382 


BTB 


BTB/POZ domain 


3e-11


50.8


1383 


WD40 


WD domain, G-beta repeat


1.6e-19


78.3


1384 


WD40 


WD domain, G-beta repeat


6.3e-24


92.9


1387 


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger)


1.1e-09


35.6


1389 


zf-C2H2




5.5e-50


179.5 


1390 


zf-C2H2


Zinc finger, C2H2 type 


2.5e-85


296.9 


1393 


kinesin


Kinesin motor domain


7.8e-188


637.4


1394 


zf-C2H2


Zinc finger, C2H2 type


1.2e-49


178.4 


1398 


KRAB 


KRAB box 


5.1e-22


86.6


1402 


bZIP 


bZIP transcription factor 


0.035


13.1


1405 


sugar_tr


Sugar (and other) transporter


0.003


-101.5


1406 




RhoGAP domain 


8.9e-47


168.6 


1407 




RNA recognition motif. 


1e-35


132.1 


1408


LRR


Leucine Rich Repeat


2.1e-13


58.0


1409


a t 


Nebulin repeat 


6e-54 


192.6 


1410 


ank 


Ank repeat


1.6e-17


71.6 


1412 


_c 


ribosomal L5P family C- terminus 


8.2e-58


205.5 


1415 


trypsin




4.7e-85


270.4


1416 


aminotran_1


Aminotransferases class-I


4.4e-05


-91.2 


1417 


S1


S1 RNA binding domain


1.6e-07


33.1


1419 


WD40


WD domain, G-beta repeat


2.2e-09


44.6 


1422 




Cadherin domain


8.3e-42


152.3 


1424 


SH3 


SH3 domain


2.5e-80


280.3 


1425 


PHD 


PHD-finger


3.2e-17


70.6 


1426 


PHD 




3.2e-17


70.6


1427 


ArfGap 


Putative GTP-ase activating 
protein for Arf 


1e-37


138.8 


1428 


helicase_C


Helicases conserved C-terminal


1e-26


102.2 


1429 


WD40 


WD domain, G-beta repeat


3.9e-07


37.2 


1430 


Inositol_P


Inositol monophosphatase family


2.5e-10


40.2 


1431 


mito_carr


Mitochondrial carrier proteins


4.3e-83


287.7 


1433


C1q


C1q domain


2.9e-16


66.2


1434 


WD40




J. . bc-lj 


58.3


1435 


Inos-1-
P_synth


synthase 


7e-228 


770.4


1435 


rrm 


RNA recognition motif. 


1.4e-34


128.3


1438 


ig 




i • Jc - X^s 


45.6


1440 


G_Adapt_CT 


Gamma-adaptin, C-terminus


3.4e-67


236.7 


1441 


G_Adapt_CT




3.4e-67


236.7 


1443 


Kelch


Kelch motif 




28.7 


1446 


ARID 


ARID DNA binding domain 


1.8e-21


84.7


1447 


zf-C2H2 


Zinc finger, C2H2 type 


9.4e-28


105.6 


1448 


AMP-binding


AMP-binding enzyme 


2.6e-07 


-145.1 


1451 


rrm 


RNA recognition motif. 


6.5e-21


82.9


1454 


ig


Immunoglobulin domain 


5.6e-44 


146.7 


1455 


Sialyltransf


Sialyltransferase family


5.4e-21 


83.2 


1460 


Aldose_epim


Aldose 1-epimerase


1 .9e-3£ 


131.2


1461 


C2 


C2 domain


4e-18


73.6


1470 


TIG


IPT/TIG domain


3.1e-19


77.3


PseudoU_synt
h_2


RNA pseudouridylate synthase


4.3e-16


66.9


1474 


DENN 


DENN (AEX-3) domain 


1.3e-44 


161.6 


1475


Cation_efflu
x


Cation efflux family


4.6e-49


176.4


1477 


TBC 




1 Q e» A *7 

1 oe-4 / 


169.0


1478 


rrm 


RNA recognition motif. 


1.2e-21


84.6


1480 


ig 


Immunoglobulin domain


5.5e-06


24.3


1484 


Telo_bind_al 
pha 


Telomere-binding protein alpha
subunit


0.028 


-225.9 


1485 


zf-C2H2


Zinc finger, C2H2 type 


1.8e-68 


240.9 


1486 


pkinase


Eukaryotic protein kinase


9.5e-13 


49.9 


1488 


helicase_C 


Helicases conserved C-terminal


1.4e-15 


65.2 


1489


DUF89 


Protein of unknown function 
DUF89


0.079


-132.4 


1490 


ECH 


Enoyl-CoA hydratase/isomerase
family


5.2e-41


149.7


1491 


guanylate_cy


Adenylate and Guanylate cyclase
catalyt


5.9e-46


166.1


1492 


LRR 


Leucine Rich Repeat 


3.4e-19


77.2 


1495 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


7.1e-10


36.3


1497 


pkinase


Eukaryotic protein kinase 
domain 


1e-22


85.8 


1500 


SH3 


SH3 domain 


9.3e-05


27.2 


1502


homeobox


Homeobox domain


0.084


13.8


1503


homeobox


Homeobox domain


0.084


13.8


1505


EGF


EGF-like domain


2.7e-23


90.8


1506


UCH-2


Ubiquitin carboxyl-terminal
hydrolase family


2.7e-21


84.2


1508

1511
1512


0

PX

Sulfatase


Peptidase family M20/M25/M40

PX domain
Sulfatase


2.8e-28
1.9e-11


101.8 


1516 
1518 

1520 


Syntaxin 
aminotran_3

ig


Syntaxin

Aminotransferases class-III
pyridoxal-pho


2.8e-35

0.011 

9.7e-106 


130.7 
-62.3 
305.6 


1521 


RA 


Atiiuiunuyj.uiJU.1 III UUlUdXH j 

Ras association (RalGDS/AF-6) 
domain


0.075
0.013


11.0
13.3


1523
1528
1535


RhoGAP

WD40

IMS


RhoGAP domain
WD domain, G-beta repeat
impB/mucB/samB family


2.5e-05
5.4e-24


18.7
93.1


1538 
1539 

1540 


FYVE 
DAGKc 

Ocular_alb


FYVE zinc finger

Diacylglycerol kinase catalytic
domain
Ocular albinism type 1 protein


7.8e-95
3.2e-27
6e-07 

0 


328.5 
101.5 
36.5 

1184.7 


1653 
1654 


SAP 

Amino_oxidase


Flavin containing amine oxidase


6e-06
3.2e-43


33.2
157.0 


1655 
1656 


Amino_oxidase

RhoGEF


Flavin containing amine oxidase
RhoGEF domain


3.2e-43 
1.4e-24 


157.0 
95.1 


1657 
1659 

1660 


MMR_HSR1
UCH-2

actin


GTPase of unknown function
Ubiquitin carboxyl-terminal
hydrolase family
Actin


0.0011
2.5e-11

6.6e-21


-45.5
51.1

69.9


1661 
1662 

1663 


BAH 
vwa 

WD40 


BAH domain 

von Willebrand factor type A
domain
WD domain, G-beta repeat


1.7e-82
0

1.4e-67


287.5
1909.4 

237.9 


1667 
1669 

1671 


zf-C2H2

Nol1_Nop2_Sun

SH2


Zinc finger, C2H2 type
NOL1/NOP2/sun family

Src homology domain 2


1.3e-93
1.3e-23

5.4e-15


324.4
34.3

16.9


1672 


chromo 


Organization Modifier)


2.1e-18


67.7


1674 


zf-CCCH 


Zinc finger C-x8-C-x5-C-x3-H 
type 


0.0025 


1*7 c 


"1676 " 


Glyco_hydro_47


Glycosyl hydrolase family 47


1.8e-167


636.2


1677 


Glyco_hydro_47


Glycosyl hydrolase family 47


4.5e-74 


259.5 


1680 


WD40 


WD domain, G-beta repeat 


1.1e-27


105.5


1681 


WD40 


WD domain, G-beta repeat 


1.1e-27


105.5


1683 


MMR_HSR1


GTPase of unknown function


1.8e-78


274.1


1691 


rrm 


RNA recognition motif.


i . oe — o / 


137.9


1692 


rrm 




1 • oe- j / 


137.9


1693 


AAA 


cellular act 


1.3e-81


284.5


1697 


Ferric_reduct


Ferric reductase
transmembrane com


8.4e-82


285.2


1698 


Ferric_reduct


Ferric reductase
transmembrane com


"5 do C 1 
S • 3c - 


190.1


1699 


zf-C2H2


Zinc finger, C2H2 type 






1700 


arf 


ADP-ribosylation factor family 


9e-19 


75.8 


1702 


GTP_EFTU 


Elongation factor Tu family


0.014


11.4


1703 


SCAN 


SCAN domain 


1.8e-54 


194.4


1707 


pkinase 


Eukaryotic protein kinase

domain


1.2e-88


307.9 


1709 


WD40 


WD domain, G-beta repeat


0.0035


24.0


1710 


LRR 


Leucine Rich Repeat


1.2e-30


115.3


1711 


WW 


WW domain 


7.6e-12 


52.8 


1712 


ank 


Ank repeat


4.2e-34


126.7


1713 


zf-CCCH


Zinc finger C-x8-C-x5-C-x3-H
type


2.6e-09


38.3 


1714 


zf-CCCH 


Zinc finger C-x8-C-x5-C-x3-H
type


2.6e-09


38.3


1715 


ras 


Ras family


4.4e-41


149.9


1718 


HMG_box 


HMG (high mobility group) box


8.3e-21


82.6


1719 


TBC 




1.1e-45


155.2


1721 


HLH


Helix-loop-helix DNA-binding

domain


9.2e-10


45.9 


1723 




Double-stranded RNA binding
motif


2.9e-05


30.9 


1724 




Ribosomal RNA adenine

dimethylases


0.045


9.2 


1725 


CIDE-N 




5.9e-40


145.2


1726 


HAT 


HAT (Half-A-TPR) repeats 


2.9e-44


160.5


1728 


efhand 


EF hand




79.9 


1733 


Hist_deacetyl


Histone deacetylase family


1.7e-104


360.6


1735 


LRR 


Leucine Rich Repeat 


4.6e-34




1739 


PI-PLC-X 


Phosphatidylinositol-specific
phospholipase


0.0023


AO . 1 


1743 


ras 


Ras family 


3.7e-10


-21.3 


1744 


ras 


Ras family 


3.7e-10


-21.3


1745 


RasGEF 


RasGEF domain


3.2e-49


176.9


1746 


adh_short 


short chain dehydrogenase 


7.1e-08


34.6


1751 


zf-C2H2 


Zinc finger, C2H2 type 


9e-39 


142.2


1754


fn3


Fibronectin type III domain


5.5e-101


348.9


1756 


zf-C2H2


Zinc finger, C2H2 type 


6.3e-93 


322.1 


1758 


rrm 


RNA recognition motif. 


0.017 


21.2 


1760 


Nop 


Putative snoRNA binding domain 


6.1e-95


328.8 


1761 


Nop 


Putative snoRNA binding domain 


6.1e-95 


328.8 


1765


MMR_HSR1


GTPase of unknown function


6.4e-41


149.4


1769 


CN_hydrolase 


Carbon-nitrogen hydrolase


3e-06


-43.9


1775 


ank 


Ank repeat 


4.1e-07


37.1 


1779 


Oxysterol_BP


Oxysterol-binding protein


4.7e-56 


199.6 


1783 


RhoGEF 


RhoGEF domain 


1.6e-23 


91.6 


1784


RhoGEF 


RhoGEF domain 


1.6e-23 


91.6


1785 


rrm 


RNA recognition motif.


6.4e-14 


59.7 



TABLE 5



SEQ ID NO: | POSITION OF SIGNAL IN AMINO ACID SEQUENCE | MaxS (MAXIMUM SCORE) | MeanS (MEAN SCORE)
1 | 1-21 | 0.991 | 0.955
2 | 1-31 | 0.995 | 0.944
3 | 1-33 | 0.949 | 0.736
4 | 1-19 | 0.970 | 0.951
5 | 1-26 | 0.971 | 0.863
6 | 1-26 | 0.971 | 0.863
7 | 1-26 | 0.971 | 0.863
8 | 1-26 | 0.971 | 0.863
9 | 1-46 | 0.982 | 0.901
10 | 1-21 | 0.991 | 0.955
11 | 1-23 | 0.989 | 0.899
12 | 1-25 | 0.955 | 0.803
13 | 1-18 | 0.932 | 0.625
14 | 1-18 | 0.938 | 0.876
15 | 1-25 | 0.941 | 0.811
16 | 1-17 | 0.972 | 0.939
17 | 1-27 | 0.964 | 0.777
18 | 1-16 | 0.914 | 0.657
19 | 1-19 | 0.953 | 0.840
20 | 1-20 | 0.935 | 0.701
21 | 1-22 | 0.974 | 0.850
22 | 1-33 | 0.961 | 0.895
23 | 1-19 | 0.991 | 0.959
24 | 1-31 | 0.995 | 0.944
25 | 1-22 | 0.976 | 0.935
26 | 1-27 | 0.996 | 0.928
27 | 1-24 | 0.953 | 0.739
28 | 1-21 | 0.906 | 0.688
29 | 1-31 | 0.986 | 0.841
30 | 1-28 | 0.980 | 0.893
31 | 1-19 | 0.993 | 0.976
32 | 1-22 | 0.998 | 0.909
35 | 1-33 | 0.949 | 0.736
36 | 1-33 | 0.949 | 0.736
46 | 1-19 | 0.970 | 0.951
67 | 1-25 | 0.968 | 0.848
71 | 1-18 | 0.949 | 0.845
72 | 1-30 | 0.991 | 0.919
75 | 1-29 | 0.958 | 0.854
88 | 1-20 | 0.986 | 0.945
94 | 1-33 | 0.994 | 0.943
97 | 1-46 | 0.964 | 0.595
103 | 1-49 | 0.983 | 0.570
108 | 1-26 | 0.978 | 0.885
111 | 1-23 | 0.989 | 0.899
126 | 1-25 | 0.955 | 0.803
129 | 1-19 | 0.963 | 0.918
138 | 1-29 | 0.971 | 0.844
143 | 1-18 | 0.914 | 0.628
148 | 1-20 | 0.969 | 0.904
156 | 1-25 | 0.941 | 0.811
158 | 1-22 | 0.979 | 0.927
160 | 1-17 | 0.972 | 0.939
161 | 1-48 | 0.903 | 0.571
162 | 1-25 | 0.937 | 0.729
168 | 1-16 | 0.939 | 0.826
171 | 1-27 | 0.964 | 0.777
178 | 1-21 | 0.945 | 0.825
180 | 1-27 | 0.981 | 0.941
187 | 1-28 | 0.982 | 0.936
190 | 1-19 | 0.953 | 0.840
196 | 1-22 | 0.975 | 0.916
197 | 1-22 | 0.963 | 0.936


199 | 1-20 | 0.935 | 0.701
200 | 1-23 | 0.977 | 0.773
206 | 1-30 | 0.984 | 0.890
207 | 1-19 | 0.990 | 0.924
208 | 1-22 | 0.974 | 0.850
210 | 1-40 | 0.940 | 0.670
211 | 1-28 | 0.971 | 0.849
216 | 1-24 | 0.986 | 0.956
218 | 1-33 | 0.961 | 0.895
219 | 1-19 | 0.970 | 0.871
221 | 1-19 | 0.904 | 0.553
222 | 1-21 | 0.917 | 0.555
230 | 1-19 | 0.991 | 0.959
231 | 1-26 | 0.953 | 0.800
232 | 1-25 | 0.988 | 0.826
239 | 1-23 | 0.969 | 0.828
240 | 1-17 | 0.982 | 0.955
241 | 1-17 | 0.982 | 0.955
245 | 1-30 | 0.970 | 0.722
248 | 1-22 | 0.976 | 0.935
249 | 1-23 | 0.968 | 0.940
252 | 1-18 | 0.971 | 0.923
261 | 1-24 | 0.883 | 0.587
265 | 1-18 | 0.939 | 0.868
272 | 1-24 | 0.953 | 0.739
283 | 1-21 | 0.906 | 0.688
284 | 1-29 | 0.997 | [illegible]
290 | 1-31 | 0.986 | 0.841
302 | 1-28 | 0.960 | 0.893
304 | 1-16 | 0.907 | [illegible]
312 | 1-19 | 0.993 | [illegible]
313 | 1-17 | 0.930 | [illegible]
323 | 1-22 | 0.998 | [illegible]
324 | 1-17 | 0.982 | 0.954
328 | 1-19 | 0.971 | 0.865
329 | 1-22 | 0.963 | 0.924
330 | 1-33 | 0.978 | 0.841
331 | 1-24 | 0.920 | 0.712
332 | 1-24 | 0.575 | 0.881
333 | 1-19 | 0.984 | 0.941
334 | 1-20 | 0.899 | 0.567
335 | 1-27 | 0.942 | 0.813
336 | 1-20 | 0.952 | 0.850
337 | 1-38 | 0.942 | 0.653
338 | 1-27 | 0.973 | 0.772
339 | 1-36 | 0.979 | 0.804
340 | 1-27 | 0.888 | 0.597
343 | 1-19 | 0.971 | 0.865
344 | 1-22 | 0.994 | 0.928
345 | 1-17 | 0.966 | 0.687
346 | 1-19 | 0.936 | 0.822
347 | 1-22 | 0.963 | 0.924
349 | 1-24 | 0.982 | 0.966
351 | 1-21 | 0.918 | 0.815
352 | 1-31 | 0.988 | 0.912
354 | 1-31 | 0.914 | 0.839
355 | 1-29 | 0.932 | 0.632
356 | 1-15 | 0.994 | 0.969
357 | 1-33 | 0.935 | 0.726
350 | 1-27 | 0.938 | 0.827
361 | 1-25 | 0.954 | 0.674
362 | 1-22 | 0.929 | 0.788
363 | 1-21 | 0.681 | 0.715
364 | 1-33 | 0.978 | 0.841
365 | 1-33 | 0.978 | 0.841




366 | 1-21 | 0.916 | 0.820
367 | 1-19 | 0.936 | 0.822
368 | 1-29 | 0.972 | 0.874
370 | 1-24 | 0.920 | 0.712
371 | 1-24 | 0.961 | 0.773
372 | 1-27 | 0.919 | 0.768
373 | 1-19 | 0.986 | 0.945
375 | 1-32 | 0.994 | 0.932
376 | 1-34 | 0.987 | 0.810
377 | 1-17 | 0.995 | 0.950
378 | 1-49 | 0.971 | 0.749
380 | 1-20 | 0.968 | 0.874
381 | 1-20 | 0.928 | 0.782
382 | 1-19 | 0.986 | 0.934
383 | 1-28 | 0.965 | 0.829
384 | 1-39 | 0.970 | 0.551
386 | 1-24 | 0.975 | 0.881
388 | 1-30 | 0.989 | 0.868
389 | 1-19 | 0.984 | 0.941
390 | 1-26 | 0.971 | [illegible]
392 | 1-20 | 0.981 | 0.900
393 | 1-16 | 0.968 | 0.890
394 | 1-23 | 0.937 | 0.701
397 | 1-22 | 0.985 | [illegible]
399 | 1-46 | 0.977 | 0.698
401 | 1-20 | 0.899 | 0.567
402 | 1-22 | 0.967 | 0.931
403 | 1-27 | [illegible] | 0.934
404 | 1-19 | 0.991 | 0.973
405 | 1-23 | 0.994 | 0.921
407 | 1-35 | 0.987 | [illegible]
408 | 1-39 | 0.976 | 0.551
409 | 1-33 | 0.897 | 0.570
410 | 1-25 | 0.990 | [illegible]
411 | 1-38 | 0.977 | [illegible]
412 | 1-20 | 0.944 | 0.768
413 | 1-20 | 0.988 | 0.965
414 | 1-46 | [illegible] | 0.638
415 | 1-23 | 0.981 | 0.940
417 | 1-29 | 0.941 | [illegible]
418 | 1-20 | 0.952 | 0.850
419 | 1-19 | 0.986 | 0.967
420 | 1-29 | 0.965 | 0.861
421 | 1-22 | 0.889 | 0.785
422 | 1-48 | 0.982 | 0.862
424 | 1-19 | 0.979 | 0.933
428 | 1-38 | 0.942 | 0.653
430 | 1-18 | 0.947 | 0.595
432 | 1-33 | 0.957 | 0.789
433 | 1-26 | 0.979 | 0.904
434 | 1-27 | 0.962 | 0.777
435 | 1-24 | 0.998 | 0.977
43[illegible] | 1-27 | 0.973 | 0.772
443 | 1-15 | 0.966 | 0.940
448 | 1-36 | 0.979 | 0.804
453 | 1-41 | 0.958 | 0.609
455 | 1-33 | 0.943 | 0.606
457 | 1-27 | 0.888 | 0.597
462 | 1-16 | 0.925 | 0.681
486 | 1-27 | 0.972 | 0.845
495 | 1-24 | 0.917 | 0.636
498 | 1-26 | 0.993 | 0.890
505 | 1-20 | 0.976 | 0.926
507 | 1-17 | 0.966 | 0.687
510 | 1-23 | 0.930 | 0.593






511 | 1-23 | 0.930 | 0.593
512 | 1-23 | 0.930 | 0.593
515 | 1-18 | 0.978 | 0.956
523 | 1-19 | 0.936 | 0.822
529 | 1-22 | 0.963 | 0.924
545 | 1-24 | 0.982 | 0.966
550 | 1-30 | 0.933 | 0.713
552 | 1-21 | 0.973 | 0.912
554 | 1-23 | 0.969 | 0.784
571 | 1-21 | 0.918 | 0.815
574 | 1-31 | 0.988 | 0.912
580 | 1-39 | 0.925 | 0.556
594 | 1-31 | 0.974 | 0.839
608 | 1-29 | 0.932 | 0.632
609 | 1-29 | 0.932 | 0.632
610 | 1-21 | 0.990 | [illegible]
621 | 1-15 | 0.994 | [illegible]
623 | 1-33 | 0.935 | [illegible]
653 | 1-27 | 0.938 | [illegible]
668 | 1-22 | 0.929 | 0.788
677 | 1-16 | 0.948 | [illegible]
685 | 1-21 | 0.881 | 0.715
699 | 1-22 | 0.975 | [illegible]
702 | 1-31 | [illegible] | 0.898
707 | 1-16 | 0.860 | [illegible]
713 | 1-25 | 0.966 | 0.743
718 | 1-19 | 0.936 | 0.822
719 | 1-20 | [illegible] | 0.824
729 | 1-29 | [illegible] | 0.874
735 | 1-46 | [illegible] | 0.598
746 | 1-14 | [illegible] | [illegible]
747 | 1-22 | [illegible] | [illegible]
748 | 1-29 | [illegible] | 0.785
759 | 1-24 | [illegible] | 0.773
767 | 1-27 | 0.919 | 0.768
768 | 1-33 | [illegible] | 0.585
773 | 1-42 | 0.959 | [illegible]
779 | 1-19 | 0.986 | [illegible]
797 | 1-19 | 0.944 | [illegible]
798 | 1-19 | [illegible] | [illegible]
820 | 1-17 | 0.995 | [illegible]
827 | 1-49 | 0.971 | 0.749
848 | 1-20 | 0.968 | 0.874
864 | 1-20 | 0.928 | 0.782
866 | 1-19 | 0.986 | 0.934
873 | 1-23 | 0.948 | 0.886
881 | 1-28 | 0.965 | 0.829
887 | 1-39 | 0.970 | 0.551
927 | 1-30 | 0.989 | 0.868
934 | 1-48 | 0.988 | 0.777
939 | 1-39 | 0.994 | 0.889
944 | 1-26 | 0.971 | 0.782
950 | 1-29 | 0.957 | 0.845
963 | 1-20 | 0.981 | 0.900
964 | 1-20 | 0.886 | 0.558
973 | 1-16 | 0.968 | 0.890
980 | 1-34 | 0.961 | 0.749
981 | 1-20 | 0.953 | 0.822
984 | 1-12 | 0.938 | 0.780
1015 | 1-22 | 0.985 | 0.854
1040 | 1-46 | 0.977 | 0.698
1052 | 1-18 | 0.969 | 0.842
1059 | 1-20 | 0.927 | 0.867
1065 | 1-33 | 0.983 | 0.918
1069 | 1-22 | 0.993 | 0.935


1075 | 1-27 | 0.992 | 0.934
1080 | 1-19 | 0.931 | 0.829
1092 | 1-19 | 0.991 | 0.973
1094 | 1-46 | 0.992 | 0.653
1095 | 1-30 | 0.974 | 0.929
1105 | 1-23 | 0.994 | 0.921
1123 | 1-35 | 0.987 | 0.658
1138 | 1-32 | 0.954 | 0.613
1140 | 1-33 | 0.989 | 0.789
1142 | 1-33 | 0.897 | 0.570
1152 | 1-25 | 0.990 | 0.962
1170 | 1-38 | 0.977 | 0.827
1176 | 1-20 | 0.944 | 0.768
1187 | 1-20 | 0.988 | 0.965
1189 | 1-35 | 0.967 | 0.839
1192 | 1-46 | 0.993 | 0.638
1193 | 1-16 | 0.925 | 0.710
1197 | 1-29 | 0.985 | 0.853
1208 | 1-23 | 0.981 | 0.940
1225 | 1-29 | 0.941 | 0.672
1245 | 1-19 | 0.966 | 0.967
1258 | 1-29 | 0.965 | 0.861
1265 | 1-22 | 0.889 | 0.785
1266 | 1-20 | 0.944 | 0.809
1276 | 1-48 | 0.982 | 0.862
1292 | 1-19 | 0.979 | 0.933
1296 | 1-21 | 0.984 | 0.944
1297 | 1-19 | 0.984 | 0.953
1332 | 1-38 | 0.942 | 0.653
1358 | 1-18 | 0.947 | 0.595
1371 | 1-33 | 0.957 | 0.789
1380 | 1-26 | 0.979 | 0.904
1397 | 1-27 | 0.962 | 0.777
1399 | 1-23 | 0.997 | 0.960
1404 | 1-24 | 0.998 | 0.977
1410 | 1-15 | 0.946 | 0.845
1414 | 1-24 | 0.913 | 0.588
1415 | 1-19 | 0.982 | 0.929
1416 | 1-12 | 0.931 | 0.891
1418 | 1-30 | 0.933 | 0.563
1420 | 1-20 | 0.881 | 0.561
1421 | 1-19 | 0.990 | 0.968
1423 | 1-17 | 0.968 | 0.863
1424 | 1-21 | 0.885 | 0.591
1425 | 1-24 | 0.913 | 0.588
1426 | 1-24 | 0.913 | 0.588
1428 | 1-25 | 0.957 | 0.899
1430 | 1-34 | 0.977 | 0.819
1431 | 1-28 | 0.979 | 0.923
1432 | 1-36 | 0.957 | 0.613
1433 | 1-32 | 0.921 | 0.753
1434 | 1-39 | 0.983 | 0.621
1435 | 1-25 | 0.910 | 0.631
1436 | 1-42 | 0.988 | 0.868
1437 | 1-22 | 0.998 | 0.980
1442 | 1-20 | 0.918 | 0.753
1448 | 1-12 | 0.931 | 0.891
1462 | 1-18 | 0.968 | 0.888
1490 | 1-20 | 0.881 | 0.561
1518 | 1-17 | 0.968 | 0.863
1525 | 1-21 | 0.885 | 0.591
1547 | 1-28 | 0.974 | 0.891
[illegible] | 1-25 | 0.967 | 0.899
1580 | 1-17 | 0.923 | 0.824
1593 | 1-28 | 0.979 | 0.923


1596 | 1-16 | 0.929 | 0.709
1601 | 1-36 | 0.957 | 0.613
1606 | 1-22 | 0.979 | 0.831
1607 | 1-20 | 0.974 | 0.770
1608 | 1-32 | 0.921 | 0.753
1614 | 1-33 | 0.969 | 0.829
[illegible] | 1-20 | 0.959 | 0.869
1625 | 1-39 | 0.983 | 0.621
1632 | 1-25 | 0.910 | 0.631
1636 | 1-33 | 0.897 | 0.591
1639 | 1-42 | 0.988 | 0.868
1645 | 1-20 | 0.927 | 0.568
1647 | 1-17 | 0.923 | 0.742
1648 | 1-22 | 0.998 | 0.980



TABLE 6



SEQ ID NO: of full-length nucleotide sequence | SEQ ID NO: of full-length peptide sequence | SEQ ID NO: of contig nucleotide sequence | SEQ ID NO: of contig peptide sequence | Priority docket number_corresponding SEQ ID NO: in priority application | SEQ ID NO: in U.S.S.N. 09/488,725
1 | 1787 | 3573 | 5359 | 784CIP2_1 | 1103
2 | 1788 | 3574 | 5360 | 784CIP2_2 | 2673
3 | 1789 | 3575 | 5361 | 784CIP2_3 | 4117
4 | 1790 | 3576 | 5362 | 784CIP2_4 | 5556
5 | 1791 | 3577 | 5363 | 784CIP2_5 | 5562
6 | 1792 | 3578 | 5364 | 784CIP2_6 | 5562
7 | 1793 | 3579 | 5365 | 784CIP2_7 | 5562
8 | 1794 | 3580 | 5366 | 784CIP2_8 | 5562
9 | 1795 | 3581 | 5367 | 784CIP2_9 | 5563
10 | 1796 | 3582 | 5368 | 784CIP2_10 | 5564
11 | 1797 | 3583 | 5369 | 784CIP2_11 | 5565
12 | 1798 | 3584 | 5370 | 784CIP2_12 | 5689
13 | 1799 | 3585 | 5371 | 784CIP2_13 | 5729
14 | 1800 | 3586 | 5372 | 784CIP2_14 | 5745
15 | 1801 | 3587 | 5373 | 784CIP2_15 | 5777
16 | 1802 | 3588 | 5374 | 784CIP2_16 | 5777
17 | 1803 | 3589 | 5375 | 784CIP2_17 | 5789
18 | 1804 | 3590 | 5376 | 784CIP2_18 | 5792
19 | 1805 | 3591 | 5377 | 784CIP2_19 | 5804
20 | 1806 | 3592 | 5378 | 784CIP2_20 | 5805
21 | 1807 | 3593 | 5379 | 784CIP2_21 | 5805
22 | 1808 | 3594 | 5380 | 784CIP2_22 | 5844
23 | 1809 | 3595 | 5381 | 784CIP2_23 | 5844
24 | 1810 | 3596 | 5382 | 784CIP2_24 | 5850
25 | 1811 | 3597 | 5383 | 784CIP2_25 | 5867
26 | 1812 | 3598 | 5384 | 784CIP2_26 | 5973
27 | 1813 | 3599 | 5385 | 784CIP2_27 | 5995
28 | 1814 | 3600 | 5386 | 784CIP2_28 | 5995
29 | 1815 | 3601 | 5387 | 784CIP2_29 | 6005
30 | 1816 | 3602 | 5388 | 784CIP2_30 | 6007
31 | 1817 | 3603 | 5389 | 784CIP2_31 | 6007
32 | 1818 | 3604 | 5390 | 784CIP2_32 | 6009
33 | 1819 | 3605 | 5391 | 784CIP2_33 | [illegible]
34 | 1820 | 3606 | 5392 | 784CIP2_34 | 6015
35 | 1821 | 3607 | 5393 | 784CIP2_35 | 6016
36 | 1822 | 3608 | 5394 | 784CIP2_36 | 6016
37 | 1823 | 3609 | 5395 | 784CIP2_37 | 6018
38 | 1824 | 3610 | 5396 | 784CIP2_38 | 6018
39 | 1825 | 3611 | 5397 | 784CIP2_39 | 6018
40 | 1826 | 3612 | 5398 | 784CIP2_40 | 6023
41 | 1827 | 3613 | 5399 | 784CIP2_41 | 6070
42 | 1828 | 3614 | 5400 | 784CIP2_42 | 6081
43 | 1829 | 3615 | 5401 | 784CIP2_43 | 6089
44 | 1830 | 3616 | 5402 | 784CIP2_44 | 6118
45 | 1831 | 3617 | 5403 | 784CIP2_45 | 6118
46 | 1832 | 3618 | 5404 | 784CIP2_46 | 6130
47 | 1833 | 3619 | 5405 | 784CIP2_47 | 6177
48 | 1834 | 3620 | 5406 | 784CIP2_48 | 6189
49 | 1835 | 3621 | 5407 | 784CIP2_49 | 6191
50 | 1836 | 3622 | 5408 | 784CIP2_50 | 6204
51 | 1837 | 3623 | 5409 | 784CIP2_51 | 6204
52 | 1838 | 3624 | 5410 | 784CIP2_52 | 6284
53 | 1839 | 3625 | 5411 | 784CIP2_53 | 6367
54 | 1840 | 3626 | 5412 | 784CIP2_54 | 6436
55 | 1841 | 3627 | 5413 | 784CIP2_55 | 6442
56 | 1842 | 3628 | 5414 | 784CIP2_56 | 6445
57 | 1843 | 3629 | 5415 | 784CIP2_57 | 6457
58 | 1844 | 3630 | 5416 | 784CIP2_58 | 6458
59 | 1845 | 3631 | 5417 | 784CIP2_59 | 6458


60 | 1846 | 3632 | 5418 | 784CIP2_60 | 6462
61 | 1847 | 3633 | 5419 | 784CIP2_61 | 6472
62 | 1848 | 3634 | 5420 | 784CIP2_62 | 6499
63 | 1849 | 3635 | 5421 | 784CIP2_63 | 6499
64 | 1850 | 3636 | 5422 | 784CIP2_64 | 6505
65 | 1851 | 3637 | 5423 | 784CIP2_65 | 6534
66 | 1852 | 3638 | 5424 | 784CIP2_66 | 6534
67 | 1853 | 3639 | 5425 | 784CIP2_67 | 6540
68 | 1854 | 3640 | 5426 | 784CIP2_68 | 6550
69 | 1855 | 3641 | 5427 | 784CIP2_69 | 6550
70 | 1856 | 3642 | 5428 | 784CIP2_70 | 6592
71 | 1857 | 3643 | 5429 | 784CIP2_71 | 6645
72 | 1858 | 3644 | 5430 | 784CIP2_72 | [illegible]
73 | 1859 | 3645 | 5431 | 784CIP2_73 | [illegible]
74 | 1860 | 3646 | 5432 | 784CIP2_74 | 6763
75 | 1861 | 3647 | 5433 | 784CIP2_75 | 6786
76 | 1862 | 3648 | 5434 | 784CIP2_76 | 6824
77 | 1863 | 3649 | 5435 | 784CIP2_77 | [illegible]
78 | 1864 | 3650 | 5436 | 784CIP2_78 | 6831
79 | 1865 | 3651 | 5437 | 784CIP2_79 | [illegible]
80 | 1866 | 3652 | 5438 | 784CIP2_80 | 6834
81 | 1867 | 3653 | 5439 | 784CIP2_81 | 6834
82 | 1868 | 3654 | 5440 | 784CIP2_82 | [illegible]
83 | 1869 | 3655 | 5441 | 784CIP2_83 | [illegible]
84 | 1870 | 3656 | 5442 | 784CIP2_84 | 6843
85 | 1871 | 3657 | 5443 | 784CIP2_85 | 6859
86 | 1872 | 3658 | 5444 | 784CIP2_86 | 6915
87 | 1873 | 3659 | 5445 | 784CIP2_87 | [illegible]
88 | 1874 | 3660 | 5446 | 784CIP2_88 | [illegible]
89 | 1875 | 3661 | 5447 | 784CIP2_89 | [illegible]
90 | 1876 | 3662 | 5448 | [illegible] | 6973
91 | 1877 | 3663 | 5449 | [illegible] | [illegible]
92 | 1878 | 3664 | 5450 | 784CIP2_93 | [illegible]
93 | 1879 | 3665 | 5451 | 784CIP2_94 | 7018
94 | 1880 | 3666 | 5452 | 784CIP2_95 | [illegible]
95 | 1881 | 3667 | 5453 | 784CIP2_96 | 7020
96 | 1882 | 3668 | 5454 | 784CIP2_97 | 7020
97 | 1883 | 3669 | 5455 | 784CIP2_98 | 7021
98 | 1884 | 3670 | 5456 | 784CIP2_99 | 7023
99 | 1885 | 3671 | 5457 | 784CIP2_100 | 7027
100 | 1886 | 3672 | 5458 | 784CIP2_101 | 7028
101 | 1887 | 3673 | 5459 | 784CIP2_102 | 7029
102 | 1888 | 3674 | 5460 | 784CIP2_103 | 7031
103 | 1889 | 3675 | 5461 | 784CIP2_104 | 7032
104 | 1890 | 3676 | 5462 | 784CIP2_105 | 7033
105 | 1891 | 3677 | 5463 | 784CIP2_106 | 7035
106 | 1892 | 3678 | 5464 | 784CIP2_107 | 7036
107 | 1893 | 3679 | 5465 | 784CIP2_108 | 7039
108 | 1894 | 3680 | 5466 | 784CIP2_109 | 7043
109 | 1895 | 3681 | 5467 | 784CIP2_110 | 7044
110 | 1896 | 3682 | 5468 | 784CIP2_111 | 7046
111 | 1897 | 3683 | 5469 | 784CIP2_112 | 7054
112 | 1898 | 3684 | 5470 | 784CIP2_113 | 7061
113 | 1899 | 3685 | 5471 | 784CIP2_114 | 7077
114 | 1900 | 3686 | 5472 | 784CIP2_115 | 7092
115 | 1901 | 3687 | 5473 | 784CIP2_116 | 7094
116 | 1902 | 3688 | 5474 | 784CIP2_117 | 7106
117 | 1903 | 3689 | 5475 | 784CIP2_118 | 7107
118 | 1904 | 3690 | 5476 | 784CIP2_119 | 7111
119 | 1905 | 3691 | 5477 | 784CIP2_120 | 7123
120 | 1906 | 3692 | 5478 | 784CIP2_121 | 7142
121 | 1907 | 3693 | 5479 | 784CIP2_122 | 7142


122 | 1908 | 3694 | 5480 | 784CIP2_123 | 7154
123 | 1909 | 3695 | 5481 | 784CIP2_124 | 7160
124 | 1910 | 3696 | 5482 | 784CIP2_125 | 7169
125 | 1911 | 3697 | 5483 | 784CIP2_126 | 7185
126 | 1912 | 3698 | 5484 | 784CIP2_127 | 7197
127 | 1913 | 3699 | 5485 | 784CIP2_128 | 7219
128 | 1914 | 3700 | 5486 | 784CIP2_129 | 7226
129 | 1915 | 3701 | 5487 | 784CIP2_130 | 7229
130 | 1916 | 3702 | 5488 | 784CIP2_131 | 7234
131 | 1917 | 3703 | 5489 | 784CIP2_132 | [illegible]
132 | 1918 | 3704 | 5490 | 784CIP2_133 | 7235
133 | 1919 | 3705 | 5491 | 784CIP2_134 | 7238
134 | 1920 | 3706 | 5492 | 784CIP2_135 | [illegible]
135 | 1921 | 3707 | 5493 | 784CIP2_136 | [illegible]
136 | 1922 | 3708 | 5494 | 784CIP2_137 | [illegible]
137 | 1923 | 3709 | 5495 | 784CIP2_138 | [illegible]
138 | 1924 | 3710 | 5496 | 784CIP2_139 | [illegible]
139 | 1925 | 3711 | 5497 | 784CIP2_140 | 7273
140 | 1926 | 3712 | 5498 | 784CIP2_141 | 7282
141 | 1927 | 3713 | 5499 | 784CIP2_142 | 7288
142 | 1928 | 3714 | 5500 | 784CIP2_143 | [illegible]
143 | 1929 | 3715 | 5501 | 784CIP2_144 | 7293
144 | 1930 | 3716 | 5502 | 784CIP2_145 | 7294
145 | 1931 | 3717 | 5503 | 784CIP2_146 | 7299
146 | 1932 | 3718 | 5504 | 784CIP2_147 | 7300
147 | 1933 | 3719 | 5505 | 784CIP2_148 | 7312
148 | 1934 | 3720 | 5506 | 784CIP2_149 | 7313
149 | 1935 | 3721 | 5507 | 784CIP2_150 | 7315
150 | 1936 | 3722 | 5508 | 784CIP2_151 | 7318
151 | 1937 | 3723 | 5509 | 784CIP2_152 | 7321
152 | 1938 | 3724 | 5510 | 784CIP2_153 | [illegible]
153 | 1939 | 3725 | 5511 | 784CIP2_154 | [illegible]
154 | 1940 | 3726 | 5512 | 784CIP2_155 | [illegible]
155 | 1941 | 3727 | 5513 | 784CIP2_156 | [illegible]
156 | 1942 | 3728 | 5514 | 784CIP2_157 | [illegible]
157 | 1943 | 3729 | 5515 | 784CIP2_158 | [illegible]
158 | 1944 | 3730 | 5516 | 784CIP2_159 | [illegible]
159 | 1945 | 3731 | 5517 | 784CIP2_160 | 7431
160 | 1946 | 3732 | 5518 | 784CIP2_161 | 7441
161 | 1947 | 3733 | 5519 | 784CIP2_162 | 7453
162 | 1948 | 3734 | 5520 | 784CIP2_163 | 7467
163 | 1949 | 3735 | 5521 | 784CIP2_164 | 7471
164 | 1950 | 3736 | 5522 | 784CIP2_165 | 7493
165 | 1951 | 3737 | 5523 | 784CIP2_166 | 7502
166 | 1952 | 3738 | 5524 | 784CIP2_167 | 7511
167 | 1953 | 3739 | 5525 | 784CIP2_168 | 7514
168 | 1954 | 3740 | 5526 | 784CIP2_169 | 7520
169 | 1955 | 3741 | 5527 | 784CIP2_170 | 7541
170 | 1956 | 3742 | 5528 | 784CIP2_171 | 7570
171 | 1957 | 3743 | 5529 | 784CIP2_172 | 7578
172 | 1958 | 3744 | 5530 | 784CIP2_173 | 7583
173 | 1959 | 3745 | 5531 | 784CIP2_174 | 7592
174 | 1960 | 3746 | 5532 | 784CIP2_175 | 7601
175 | 1961 | 3747 | 5533 | 784CIP2_176 | 7602
176 | 1962 | 3748 | 5534 | 784CIP2_177 | 7608
177 | 1963 | 3749 | 5535 | 784CIP2_178 | 7615
178 | 1964 | 3750 | 5536 | 784CIP2_179 | 7617
179 | 1965 | 3751 | 5537 | 784CIP2_181 | 7624
180 | 1966 | 3752 | 5538 | 784CIP2_182 | [illegible]
181 | 1967 | 3753 | 5539 | 784CIP2_183 | 7640
182 | 1968 | 3754 | 5540 | 784CIP2_184 | 7641
183 | 1969 | 3755 | 5541 | 784CIP2_185 | 7641




184 


1970 


3756 




/ O^UlrZ^ log 


/ 64 1 


185 


1971 


3757 


ceV-i 


*7 SA^TD"} 1 ft*7 
t 0*t\m.Xir4 ICS/ 


7642 


186 


1972 


3758 


1 CCA A 


/ O^il-li'^I^J. 0 0 


n'c.A 0 


187 


1973 


375*9 




'0^1*1^2 107 


7656 


188 


1974 


3760 


554 6 


/ a 41.1 lr 2 17 \J 


7657 


189 


1975 


1 *7 T 
J /OA 


5547 


/ 0 4 C1P2 > __1 7 1 


7657 


190 


1976 


1*7 CL*> 
3 i Ox 


554 8 


"7 O f- 11 —5 1 O T 

f a 4CIP2 - 192 


7662 


191 


1977 


37^3 


rc/Q 


/04Uir2 17J 


7668 


152 


1978 


3 764 


CCCA 


/84CJLP2 174 


7673 


193 


1979 


3765 


ccci 
3331 




! 7690 


194 


i or n 


■5 (DO 


5552 


784CIP2 196 


7700 


195 


1981 


A / b / 


5553 


7 84CIP2_197 


7709 


196 


1982 


j / b 0 


5554 


784CIP2 198 


7736 


197 




J /07 


5555 


7 B4CIP2_199 


7737 


198 


1 JO 1 ! 


3770 


5556 


7B4CIP2_200 


7744 


199 


1985 


J / / 1 


5557 


784CIP2 201 


7771 


200 


tone 


3772 


5558 


784CIP2_202 


7786 


«vl 


1987 


3773 


5559 


| 784CIP2_203 


7791 


202 


1988 


3774 


5560 


784CIP2_204 


7797 


203 


l a a a 


3775 


5561 


784CIP2_205 


7806 


204 




3776 


5562 


784CIP2 206 


7812 




1991 


3777 


5563 


784CIP2_207 


7812 


206 


T Q Q "> 


3778 


55 6 4 


784CIP2 208 


7818 


207 


i 1 QQ) 
137J 


3779 


5565 


784CIP2_20 9 


7822 


208 




3780 


5566 


784CIP2__210 


7827 


209 


! 1 QQC 
1773 


3781 


5567 


784CIP2_ 211 


7830 


210 


1 77t> 


3782 


5568 


784CIP2__212 


7835 




1997 


3783 


5569 


784CIP2 214 


7840 


5v5 


T QQQ 

1 770 


3784 


5570 


784CIP2_215 


7858 "~ 


0-1 -1 

/1J 


1999 


3785 


5571 


784CIP2_216 


7858 




2000 


3786 


5572 


784CIP2 217 


7861 


215 


z u u 1 


3787 


5573 


784CIP2_218 


7866 




2002 


3788 


5574 


784CIP2_219 


7868 


217 


2003 


3789 


5575 


784CIP2_220 


7896 


218 


2 0 04 


3790 


5576 


784CIP2_221 


7898 


219 


2005 


3791 


5577 


784CIP2_222 


7900 


220 


2006 


1*700 


557 8 


784CIP2_223 


7906 


221 


2007 


7 /7J 


5579 


784CIP2 224 


7908 


222 


2008 




C CO A 


to ji ntno one 
7B4CXP2 225 


7909 


223 


2009 


3795 




/o*l Ci P2 2 2 O 


7917 


224 


2010 


3796 


5582 


TP^PTDI *}*>*? 


7932 


225 


2011 


3797 




/o4Llrz 228 


7940 


226 


2012 


3798 


5584 


/84l*Xi'2 — 22 7 


794 0 


227 


2013 


3799 


5585 


/ a 4 V- 1 Jr 2 ^ J O 


7984 


228 


2014 


3800 


CCQC 
JJOD 


/o4v-lir2__2Jl 


7984 


229 


2015 


3 801 


" 5587 


"7 0 vi i^t 00 on 


8001 


! 230 


2016 


3 802 


5588 


oil 

/o*ll,l f2 2 J J 


8021 


231 


2017 


3803 


5589 




8029 


232 


2018 


3 804 


5590 


/ o4Ll F2 2 J P 


8033 


233 


2 019 


3805 


5591 


/o4CIP2_2 3 6 


8040 


234


2020


3806


5592


784CIP2_237


8052 


235 


2021 


3807 


5593


784CIP2_238


8096


236 


2022 


3808 


5594 


784CIP2 239 


8096 


237 


2023 


3809 


5595 


784CIP2_240


8113 


238 


2024 


3810 


5596 


784CIP2 241 


8126 


239 


2025 


3811 


5597 


784CIP2 242 


8132 


240 


2026 


3812 


5598 


784CIP2 243 


8137 


241 


2027 


3813 


5599 


784CIP2 244 


8137 


242 


2028 


3814 


5600 


784CIP2_245 


8159 


243


2029 


3815 


5601 


784CIP2 246 


8159 


244 


2030 


3816 


5602 


784CIP2 247 


8161 


245 


2031 


3817 


5603 


784CIP2 248 


8176 
SEQ ID NO: of full-length nucleotide sequence


SEQ ID NO: of full-length peptide sequence


SEQ ID NO: of contig nucleotide sequence


SEQ ID NO: of contig peptide sequence


Priority docket number_corresponding SEQ ID NO: in priority application


SEQ ID NO: in U.S. S.N. 09/488,725


246 


2032 


3818


5604


784CIP2_249


8196 


247 


2033


3819


5605


784CIP2_250


8200


248 


2034 


3820 


5606


784CIP2_251


QO T 0 


249 


2035 


3821


5607


784CIP2_252


8220 


250


2036 


3822 


5608 


784CIP2_253


8238


251 


2037 


3823 


5609 


784CIP2_254


8254 


252 


2038 


3824 


5610 


784CIP2_255


8255 


253 


2039 


3825 


5611 


784CIP2_256


8288 


254 


2040 


3826 


5612


784CIP2_257


8296 


255


2041 


3827 


5613


784CIP2_258


8329


256 


2042 


3828 


5614 


784CIP2_259


8362 


257 


2043


3829 


5615


784CIP2 260 


8429 


258 


2044 


3830


5616


784CIP2_261


8436 


259 


2045


3831 


5617 


784CIP2_262 


8448 


260 


2046


3832


5618 


784CIP2_263


8472 


261 


2047


3833


5619 


784CIP2 264 


8502 


262


2048


3834 


5620 


784CIP2_265 


8504 


263 


2049


3835 


5621


784CIP2 266 


8507 


264 


2050


3836 


5622 


784CIP2 268 


8509 


265


2051


3837


5623 


784CIP2_269


8515 


266 


2052


3838


5624 


784CIP2_270 


8519 


267 


2053


3839


5625 


784CIP2_271 


8530 


268 


2054 


3840


5626 


784CIP2 272 


8532 


269 


2055 


3841


5627


784CIP2 273 


8532 


270 


2056 


3842 


5628 


784CIP2 274 


8539 


271


2057


3843 


5629 


784CIP2_275 


8541 


272 


2058


3844 


5630 


784CIP2 276 


8543


273


2059


3845 


5631 


784CIP2 277 


8593 


274 


2060


3846


5632 


784CIP2 278 


8595 


275 


2061


3847


5633 


784CIP2 279 


8615 


276 


2062


3848


5634 


784CIP2 280 


8620 


277 


2063 


3849


5635


784CIP2_281 


8621 


278 


2064


3850


5636


784CIP2_282 


8623 


279 


2065 


3851


5637


784CIP2_283


8625 


280 


2066 


3852


5638


784CIP2 284 


8628 


281 


2067 


3853


5639


784CIP2 285 


8628 


282 


2068


3854


5640


784CIP2_286


8629 


283 


2069 


3855


5641


784CIP2 287 


8630 


284 


2070 


3856


5642


784CIP2_288


8631 


285 


2071 


3857


5643


784CIP2_289


8633 


286 


2072 


3858 


5644


784CIP2_290


8634


287 


2073 


3859


5645


784CIP2_291


8635 


288 


2074 


3860 


5646


784CIP2_292


8636 


289 


2075 


3861


5647 


784CIP2_293


8659 


290 


2076 


3862 


5648


784CIP2_294


8660 


291 


2077 


3863 


5649 


784CIP2_295


8667 


292 


2078 


3864 


5650


784CIP2_296


8667 


293 


2079 


3865 


5651 


784CIP2_297


8685 


294 


2080 


3866 


5652 


784CIP2_298


8805 


295 


2081 


3867


5653 


784CIP2_299


8896 


296 


2082 


3868


5654


784CIP2_300


8978 


297 


2083 


3869


5655 


784CIP2_301


9046 


298 


2084 


3870 


5656 


784CIP2_302 


9048 


299 


2085 


3871 


5657 


784CIP2 303 


9116 


300 


2086 


3872 


5658 


784CIP2_304 


9195


301 


2087 


3873 


5659 


784CIP2_305 


9201 


302 


2088 


3874 


5660 


784CIP2_306 


9307 


303 


2089 


3875 


5661 


784CIP2 307 


9321 


304 


2090 


3876 


5662 


784CIP2_308


9397


305 


2091


3877 


5663 


784CIP2 309 


9405 


306 


2092 


3878 


5664 


784CIP2 310 


9406 


307 


2093 


3879 


5665 


784CIP2 311 


9422 






WO 01/53312 



PCT/US00/34263 





308 


2094


3880


5666


784CIP2_312 


9494 


309


2095


3881


5667


784CIP2 313 


9512 


310 


2096


3882


5668


784CIP2 314 


9632 


311 


2097


3883 


5669


784CIP2 315 


9661 


312 


2098 


3884 


5670


784CIP2_316


9664 


313 


2099


3885


5671


784CIP2_317


9691 


314


2100


3886 


5672 


784CIP2_318


9700


315


2101


3887


5673


784CIP2 319 


9716 


316 


2102


3888


5674 


784CIP2_320


9721 


317


2103


3889


5675 


784CIP2 321 


9870


318


2104


3890 


5676


784CIP2 322 


9887


319


2105


3891 


5677 


784CIP2 323 


9923 


320


2106 


3892 


5678 


784CIP2 324 


9938 


321


2107 


3893 


5679 


784CIP2_325 


9964 


322 


2108 


3894


5680 


784CIP2 326 


10007 


323 


2109 


3895 


5681 


784CIP2 327 


10009 


324 


2110 


3896 


5682 


784CIP2_328 


10046 


325


2111 


3897 


5683 


784CIP2_329 


10156 


326 


2112 


3898 


5684 


784CIP2_330 


10276 


327 


2113 


3899 


5685


784CIP2 331 


10283 


328


2114 


3900 


5686 


784CIP2B 1 


152 


329 


2115 


3901 


5687 


784CIP2B_2 


167 


330


2116 


3902 


5688 


784CIP2B_3 


205 


331


2117 


3903 


5689 


784CIP2B 4 


210


332


2118 


3904 


5690 


784CIP2B_5 


225


333 


2119 


3905 


5691 


784CIP2B 6 


226 


334 


2120 


3906 


5692 


784CIP2B_7


264 


335 


2121 


3907 


5693 


784CIP2B 8 


268 


336 


2122 


3908 


5694 


784CIP2B 9 


293 


337 


2123 


3909 


5695 


784CIP2B_10 


293 


338 


2124 


3910 


5696 


784CIP2B_11 


293 


339 




2125 


3911 


5697 


784CIP2B_12 


302


340


2126 


3912 


5698 


784CIP2B_13


311 


341


2127 


3913 


5699 


784CIP2B_14


352 


342 


2128 


3914 


5700 


784CIP2B_15


358


343


2129 


3915 


5701 


784CIP2B_16


368 


344 


2130


3916 


5702 


784CIP2B_17


393


345


2131


3917 


5703 


784CIP2B_18


477 


346 


2132 


3918


5704 


784CIP2B_19


508 


347


2133


3919


5705 


784CIP2B_20


508 


348 


2134 


3920 


5706


784CIP2B_21 


515


349 


2135


3921


5707


784CIP2B 22 


578 


350 


2136


3922


5708


784CIP2B_23 


588 


351 


2137


3923 


5709 


784CIP2B 24 


591 


352 


2138


3924


5710 


784CIP2B 25 


593 


353


2139


3925 


5711


784CIP2B 26 


594 


354 


2140 


3926 


5712


784CIP2B_27 


619 


355 


2141 


3927 


5713


784CIP2B_28 


620 


356 


2142 


3928


5714 


784CIP2B_29 


654 


357 


2143 


3929


5715 


784CIP2B 30 


692


358 


2144 


3930


5716 


784CIP2B 31 


753 


359 


2145


3931


5717 


784CIP2B 32 


758 


360 


2146 


3932 


5718 


784CIP2B 33 


787 


361 


2147 


3933 


5719 


784CIP2B 34 


833 


362 


2148 


3934 


5720 


784CIP2B_35 


838


363 


2149 


3935 


5721 


784CIP2B 36 


870 


364 


2150 


3936 


5722 


784CIP2B_37


891 


365 


2151 


3937 


5723 


784CIP2B_38 


891 


366 


2152 


3938 


5724 


784CIP2B_39 


921 


367 


2153 


3939 


5725 


784CIP2B 40 


924 


368 


2154 


3940 


5726 


784CIP2B 41 


932 


369 


2155


3941 


5727 


784CIP2B_42


942 


370 


2156 


3942 


5728 


784CIP2B_43 


958 


371 


2157 


3943 


5729 


784CIP2B_44


968 


372 


2158 


3944 


5730 


784CIP2B_45 


992 


373 


2159 


3945 


5731 


784CIP2B 46 


1025 


374 


2160 


3946 


5732 


784CIP2B 47 


1074 


375 


2161 


3947 


5733 


784CIP2B 48 


1104 


376 


2162 


3948 


5734 


784CIP2B_49


1114 


377 


2163 


3949 


5735 


784CIP2B 50 


1144


378 


2164 


3950 


5736 


784CIP2B 51 


1262 


379 


2165 


3951 


5737 


784CIP2B_52


1318


380 


2166 


3952


5738 


784CIP2B_53


1319


381 


2167 


3953 


5739


784CIP2B 54 


1328 


382 


2168 


3954 


5740


784CIP2B 55 


1436


383 


2169 


3955 


5741 


784CIP2B 56 


1464 


384 


2170 


3956 


5742 


784CIP2B 57 


1584 


385 


2171 


3957 


5743


784CIP2B 58 


1617 


386


2172 


3958 


5744 


784CIP2B 59 


1724 


387 


2173 


3959 


5745


784CIP2B 60 


1728 


388


2174 


3960 


5746


784CIP2B 61 


1772 


389


2175 


3961 


5747


784CIP2B_62


1809


390 


2176 


3962 


5748 


784CIP2B 63 


1 ACQ 


391 


2177 


3963 


5749


784CIP2B_64


1898


392 


2178 


3964 


5750 


784CIP2B 65 


13 £ O 


393 


2179 


3965 


5751 


784CIP2B_66




394 


2180 


3966 


5752 


784CIP2B 67 


1 QC7 

isb / 


395


2181 


3967 


5753 


784CIP2B_68


1995 


396 


2182 


3968 


5754 


784CIP2B_69




397 


2183 


3969


5755 


784CIP2B_70


dZVZ f 


398


2184 


3970


5756 


784CIP2B_71


U33 


399 


2185 


3971 


5757


784CIP2B_72


. 51 AT 

*1UJ 


400 


2186 


3972 


5758 


784CIP2B_73


«iud 


401 


2187 


3973 


5759 


784CIP2B 74 


£1DD 


402 


2188 


3974 


5760 


784CIP2B 75 


2175 


403 


2189 


3975 


5761 


784CIP2B 76 


2176 


404 


2190 


3976 


5762 


784CIP2B_78




405 


2191 


3977 


5763 


784CIP2B 79 


2250 


406 


2192 


3978 


5764 


784CIP2B 80 


2300 


407 


2193 


3979 


5765


784CIP2B_81


2323


408 


2194 


3980 


5766 


784CIP2B_82


2340 


409 


2195 


3981 


5767 


784CIP2B 83 


2371 


410 


2196 


3982 


5768 


784CIP2B 84 


2399 


411 


2197 


3983 


5769 


784CIP2B 85 


2411 


412 


2198 


3984 


5770 


784CIP2B 86 


2428 


413 


2199 


3985 


5771 


784CIP2B 87 


2430


414 


2200 


3986 


5772 


784CIP2B_88 


2439 


415 


2201 


3987 


5773 


784CIP2B_89 


2447 


416 


2202 


3988 


5774 


784CIP2B 90 


2461 


417 


2203 


3989 


5775 


784CIP2B_91 


2487


418 


2204 


3990 


5776 


784CIP2B 92 


2492 


419 


2205 


3991 


5777 


784CIP2B_93


2512 


420


2206 


3992 


5778 


784CIP2B 94 


2564 


421 


2207 


3993 


5779 


784CIP2B 95 


2678 


422 


2208 


3994 


5780 


784CIP2B 96 


2816 


423 


2209 


3995 


5781 


784CIP2B_97 


2818 


424 


2210 


3996


5782 


784CIP2B 98 


2819 


425 


2211 


3997 


5783 


784CIP2B_99 


2943 


426 


2212 


3998 


5784 


784CIP2B_100 


3137 


427 


2213 


3999 


5785 


784CIP2B_101 


3137 


428 


2214 


4000 


5786 


784CIP2B 102 


3160 


429 


2215 


4001 


5787 


784CIP2B 103 


3323 


430 


2216 


4002 


5788 


784CIP2B 104 


3360 


431


2217 


4003 


5789 


784CIP2B 105 


3362 


432 


2218 


4004 


5790 


784CIP2B 106 


3417 


433 


2219 


4005


5791 


784CIP2B_107 


3418 


434 


2220 


4006 


5792 


784CIP2B_108


3442 


435 


2221 


4007 


5793 


784CIP2B 109 


3442 


436


2222 


4008 


5794


784CIP2B_110 


3444 


437 


2223 


4009 


5795


784CIP2B_111 


3855 


438 


2224 


4010 


5796 


784CIP2B_112 


3863 


439 


2225 


4011 


5797 


784CIP2B_113 


4090 


440 


2226 


4012 


5798 


784CIP2B 114 


4105 


441 


2227


4013 


5799 


784CIP2B_115 


4142 


442 


2228 


4014 


5800 


784CIP2B 116 


4142 


443 


2229 


4015 


5801 


784CIP2B 117 


4149 


444 


2230 


4016


5802 


784CIP2B 118 


4196


445 


2231 


4017 


5803


784CIP2B 119 


4202 


446 


2232 


4018 


5804 


784CIP2B 120 


4274 


447 


2233 


4019 


5805


784CIP2B 121 


4304 


448 


2234 


4020 


5806 


784CIP2B 122 


4306 


449 


2235 


4021 


5807


784CIP2B 123 


4311 


450 


2236 


4022


5808 


784CIP2B 124 


4321 


451


2237 


4023 


5809 


784CIP2B 125 


4323 


452 


2238 


4024 


5810 


784CIP2B 126 


4332 


453 


2239 


4025 


5811 


784CIP2B 127 


4488 


454 


2240 


4026 


5812 


784CIP2B_128


4586 


455 


2241 


4027 


5813 


784CIP2B 129 


5569 


456 


2242 


4028 


5814 


784CIP2B 130 


5573 


457 


2243 


4029


5815 


784CIP2B 131 


5577 


458 


2244 


4030


5816 


784CIP2B 132 


5579 


459 


2245 


4031 


5817 


784CIP2B 133 


5582


460 


2246 


4032 


5818 


784CIP2B 134 


5583 


461 


2247 


4033 


5819 


784CIP2B_135


5584 


462 


2248 


4034 


5820 


784CIP2B 136 


5585 


463 


2249 


4035 


5821 


784CIP2B 137 


5591 


464


2250 


4036 


5822 


784CIP2B 138 


5593 


465 


2251 


4037 


5823 


784CIP2B 139 


5594


466


2252 


4038 


5824 


784CIP2B 140 


5594 


467 


2253 


4039 


5825 


784CIP2B_141 


5598 


468 


2254 


4040 


5826 


784CIP2B 142 


5602


469 


2255 


4041 


5827 


784CIP2B_143 


5605 


470 


2256 


4042 


5828 


784CIP2B_144 


5608 


471 


2257 


4043 


5829 


784CIP2B 145 


5617 


472


2258 


4044 


5830


784CIP2B_146


5620 


473 


2259 


4045 


5831 


784CIP2B 147 


5622 


474 


2260 


4046 


5832 


784CIP2B_148


5623 


475 


2261 


4047 


5833 


784CIP2B 149 


5624 


476 


2262 


4048 


5834 


784CIP2B_150 


5625 


477 


2263 


4049 


5835 


784CIP2B 151 


5627


478 


2264 


4050 


5836 


784CIP2B_152


5628 


479 


2265


4051 


5837 


784CIP2B_153 


5630 


480 


2266 


4052 


5838 


784CIP2B 154 


5632 


481 


2267 


4053 


5839 


784CIP2B 155 


5640 


482 


2268 


4054


5840 


784CIP2B_156


5641 


483 


2269


4055 


5841 


784CIP2B 157 


5643


484 


2270 


4056 


5842 


784CIP2B_158 


5647 


485 


2271 


4057 


5843 


784CIP2B 159 


5649 


486 


2272 


4058 


5844 


784CIP2B_160


5658 


487 


2273


4059 


5845 


784CIP2B 161 


5659 


488 


2274 


4060 


5846 


784CIP2B_162


5667 


489


2275 


4061 


5847 


784CIP2B 163 


5672 


490 


2276 


4062 


5848 


784CIP2B 164 


5674 


491


2277 


4063 


5849 


784CIP2B 165 


5678


492 


2278 


4064 


5850


784CIP2B 166 


5680 


493 


2279 


4065 


5851 


784CIP2B_167


5684 


494 


2280 


4066 


5852 


784CIP2B_168 


5686 


495 


2281 


4067 


5853 


784CIP2B 169 


5694 


496 


2282 


4068 


5854 


784CIP2B_170


5698 


497 


2283 


4069 


5855 


784CIP2B 171 


5699 


498


2284 


4070 


5856 


784CIP2B_172


5712 


499 


2285 


4071 


5857 


784CIP2B_173 


5719 


500 


2286 


4072 


5858


784CIP2B_174 


5720


501 


2287 


4073 


5859 


784CIP2B_175 


5727 


502 


2288 


4074 


5860 


784CIP2B 176 


5730 


503 


2289 


4075


5861 


784CIP2B 177 


5734 


504 


2290 


4076


5862 


784CIP2B 178 


5738


505 


2291 


4077


5863 


784CIP2B 179 


5739 


506 


2292 


4078 


5864


784CIP2B 180 


5740 


507 


2293 


4079 


5865 


784CIP2B 181 


5744 


508 


2294 


4080 


5866 


784CIP2B 182 


5748 


509 


2295 


4081 


5867 


784CIP2B 183 


5749 


510 


2296 


4082 


5868 


784CIP2B 184 


5750 


511 


2297 


4083 


5869 


784CIP2B_185


5750 


512 


2298 


4084


5870 


784CIP2B_186


5750 


513 


2299 


4085 


5871 


784CIP2B 187 


5761 


514 


2300 


4086 


5872 


784CIP2B_188


5762 


515 


2301 


4087 


5873 


784CIP2B_189


o / o / 


516 


2302 


4088 


5874 


784CIP2B_190


5773


517 


2303 


4089 


5875 


784CIP2B 191 


5783


518 


2304 


4090


5876 


784CIP2B_192


5784


519 


2305 


4091 


5877


784CIP2B_193


a / o o 


520 


2306 


4092 


5878


784CIP2B_194


o / yo 


521 


2307 


4093 


5879 


784CIP2B 196 




522 


2308 


4094


5880


784CIP2B_197


cot a 


523 


2309 


4095 


5881 


784CIP2B_198


5819


524 


2310 


4096 


5882 


784CIP2B 199 


5827 


525 


2311 


4097 


5883 


784CIP2B 200 


5828 


526 


2312 


4098 


5884 


784CIP2B_201


5842 


527 


2313 


4099 


5885 


784CIP2B 202 


5853 


528 


2314 


4100 


5886 


784CIP2B 203 


5861 


529 


2315 


4101 


5887 


784CIP2B 204 


5864 


530 


2316 


4102 


5888 


784CIP2B 205 


5865 


531 


2317 


4103 


5889 


784CIP2B 206 


5871


532 


2318 


4104 


5890 


784CIP2B 207 


5873


533 


2319 


4105 


5891 


784CIP2B 208 


5873 


534 


2320 


4106 


5892 


784CIP2B 209 


5875 


535 


2321 


4107 


5893 


784CIP2B_210


5878 


536 


2322 


4108 


5894 


784CIP2B 211 


5879 


537


2323 


4109 


5895 


784CIP2B 212 


5880 


538


2324 


4110 


5896 


784CIP2B 213 


5880 


539 


2325 


4111 


5897 


784CIP2B 214 


5880 


540 


2326 


4112


5898 


784CIP2B_215


5880 


541 


2327 


4113 


5899


784CIP2B_216


5885


542 


2328 


4114 


5900 


784CIP2B 217 


5895 


543 


2329 


4115 


5901 


784CIP2B 218 


5898


544 


2330 


4116 


5902 


784CIP2B 219 


5902 


545


2331


4117 


5903


784CIP2B 220 


5904 


546 


2332 


4118 


5904 


784CIP2B_221 


5918 


547 


2333 


4119 


5905 


784CIP2B 222 


5921 


548 


2334 


4120 


5906 


784CIP2B 223 


5927 


549 


2335


4121 


5907 


784CIP2B 224 


5932 


550 


2336 


4122 


5908 


784CIP2B 225 


5939 


551 


2337 


4123 


5909 


784CIP2B 226 


5945 


552 


2338 


4124 


5910 


784CIP2B_227 


5946


553


2339 


4125 


5911 


784CIP2B 228 


5947 


554 


2340 


4126 


5912 


784CIP2B_229 


5956 


555


2341


4127 


5913 


784CIP2B 230 


5957


556


2342


4128


5914 


784CIP2B_232


5975 


557


2343 


4129 


5915 


784CIP2B_233


5977 


558 


2344 


4130 


5916 


784CIP2B 234 


5978


559


2345 


4131 


5917 


784CIP2B_235


5979 


560 


2346 


4132 


5918 


784CIP2B_236 


5980 


561 


2347


4133 


5919 


784CIP2B_237 


5988 


562


2348 


4134 


5920 


784CIP2B_238


5989 


563


2349 


4135 


5921


784CIP2B_239


5991 


564 


2350 


4136 


5922 


784CIP2B_240 


5997 


565 


2351 


4137


5923 


784CIP2B_241 


5998 


566 


2352


4138 


5924 


784CIP2B 242 


6003 


567 


2353


4139 


5925 


784CIP2B_243


6004 


568 


2354 


4140 


5926 


784CIP2B_244 


6013 


569 


2355 


4141 


5927 


784CIP2B_245


6028 


570 


2356 


4142 


5928 


784CIP2B 246 


6028 


571 


2357 


4143 


5929 


784CIP2B 247 


6029 


572 


2358 


4144 


5930 


784CIP2B_248 


6031 


573 


2359 


4145 


5931 


784CIP2B_249 


6031


574 


2360 


4146 


5932 


784CIP2B 250 


6032 


575 


2361 


4147 


5933 


784CIP2B_251 


6037 


576 


2362 


4148 


5934 


784CIP2B_252 


6037 


577 


2363 


4149 


5935 


784CIP2B_253


6043 


578 


2364 


4150 


5936 


784CIP2B 254 


6044 


579 


2365 


4151 


5937 


784CIP2B_255 


6046 


580 


2366 


4152 


5938 


784CIP2B 256 


6048 


581 


2367 


4153 


5939 


784CIP2B_257 


6049 


582 


2368 


4154 


5940 


784CIP2B 258 


6051


583 


2369 


4155 


5941 


784CIP2B_259 


6053 


584 


2370 


4156 


5942 


784CIP2B_260


6060 


585 


2371 


4157 


5943 


784CIP2B 261 


6063 


586 


2372 


4158 


5944 


784CIP2B 262 


6066 


587 


2373 


4159 


5945 


784CIP2B_263 


6067 


588 


2374 


4160 


5946 


784CIP2B_264 


6068 


589 


2375 


4161 


5947 


784CIP2B 265 


6073


590 


2376


4162 


5948 


784CIP2B_266 


6076 


591 


2377 


4163 


5949 


784CIP2B 267 


6076 


592 


2378 


4164 


5950 


784CIP2B 268 


6077 


593


2379 


4165 


5951 


784CIP2B 269 


6079 


594 


2380


4166 


5952 


784CIP2B 270 


6082


595


2381


4167 


5953 


784CIP2B_272


6088 


596


2382


4168 


5954 


784CIP2B 273 


6091 


597


2383


4169 


5955 


784CIP2B_274 


6094 


598


2384


4170 


5956 


784CIP2B_275 


6101


599


2385


4171 


5957 


784CIP2B_276 


6103 


600 


2386


4172


5958 


784CIP2B_277


6104


601


2387


4173 


5959 


784CIP2B_278 


6108 


602 


2388


4174 


5960 


784CIP2B_279 


6112


603


2389


4175 


5961 


784CIP2B_280 


6121 


604 


2390


4176 


5962 


784CIP2B 281 


6125


605


2391


4177 


5963 


784CIP2B 282 


6126 


606


2392


4178


5964 


784CIP2B_283 


6128 


607


2393


4179 


5965 


784CIP2B_284 


6129 


608


2394 


4180 


5966


784CIP2B_285


6133 


609 


2395 


4181 


5967 


784CIP2B_286 


6133 


610 


2396 


4182


5968 


784CIP2B_287 


6135 


611 


2397 


4183 


5969


784CIP2B_288


6139 


612 


2398 


4184 


5970 


784CIP2B 289 


6141


613


2399 


4185 


5971 


784CIP2B 290 


6145 


614 


2400 


4186 


5972 


784CIP2B 291 


6146 


615 


2401 


4187 


5973


784CIP2B_292


6148 


616 


2402 


4188 


5974 


784CIP2B_293 


6149 


617 


2403


4189 


5975 


784CIP2B_294


6149 


618


2404 


4190 


5976 


784CIP2B_295 


6153 


619


2405 


4191 


5977 


784CIP2B_296 


6159 


620 


2406


4192 


5978 


784CIP2B_297 


6164 


621 


2407 


4193 


5979 


784CIP2B 298 


6167 


622 


2408 


4194 


5980 


784CIP2B_299 


6172 


623 


2409 


4195 


5981 


784CIP2B 300 


6173 


624 


2410 


4196 


5982 


784CIP2B_301


6190 


625 


2411 


4197 


5983 


784CIP2B 302 


6194 


626 


2412 


4198 


5984 


784CIP2B_303 


6196 


627 


2413 


4199 


5985 


784CIP2B 304 


6197 


628 


2414 


4200 


5986 


784CIP2B_305


6198 


629 


2415 


4201 


5987 


784CIP2B 306 


6198 


630 


2416 


4202 


5988 


784CIP2B 308 


6214 


631 


2417 


4203 


5989 


784CIP2B 309 


6215


632 


2418 


4204 


5990 


784CIP2B 310 


6219 


633 


2419 


4205 


5991 


784CIP2B 311 


6226 


634 


2420 


4206 


5992 


784CIP2B 312 


6229 


635 


2421 


4207 


5993 


784CIP2B_313


6234 


636 


2422 


4208 


5994 


784CIP2B 314 


6237 


637 


2423 


4209 


5995 


784CIP2B 315 




638 


2424 


4210 


5996 


784CIP2B 316 


6239 


639 


2425 


4211 


5997 


784CIP2B 317 


6239 


640 


2426 


4212 


5998 


784CIP2B 318 


6239 


641 


2427 


4213 


5999 


784CIP2B_319




642 


2428


4214 


6000


784CIP2B_320




643 


2429 


4215 


6001 


784CIP2B_321


6245 


644 


2430 


4216 


6002 


784CIP2B_322




645 


2431 


4217 


6003 


784CIP2B_323


coco 


646 


2432 


4218 


6004


784CIP2B_324


r 0 CO 


647


2433 


4219


6005


784CIP2B_325





648 


2434 


4220 


6006 


784CIP2B_326


£9*n 
ozou 


649 


2435 


4221


6007


784CIP2B_327


O £. 0 i 


650 


2436


4222 


6008


784CIP2B 328 


6264 


651 


2437 


4223 


6009 


784CIP2B 329 


6265 


652 


2438 


4224 


6010 


784CIP2B_330


6266 


653 


2439 


4225 


6011 


784CIP2B 331 


6270 


654 


2440


4226 


6012 


784CIP2B 332 


6271 


655 


2441 


4227


6013 


784CIP2B_334


6274 


656 


2442 


4228 


6014 


784CIP2B 335 


6276 


657


2443 


4229 


6015 


784CIP2B 336 


6281 


658 


2444 


4230 


6016 


784CIP2B 337 


6281 


659 


2445 


4231 


6017 


784CIP2B_338


6288 


660 


2446 


4232 


6018 


784CIP2B 339 


6292 


661 


2447 


4233 


6019 


784CIP2B 340 


6294 


662 


2448 


4234 


6020 


784CIP2B_343 


6312 


663 


2449 


4235 


6021 


784CIP2B 344 


6312 


664 


2450 


4236 


6022 


784CIP2B 345 


6312 


665 


2451 


4237 


6023 


784CIP2B_346 


6322 


666 


2452 


4238 


6024 


784CIP2B 347 


6324 


667 


2453 


4239 


6025 


784CIP2B 349 


6329 


668 


2454 


4240 


6026 


784CIP2B_350


6331 


669 


2455


4241 


6027 


784CIP2B 351 


6333 


670 


2456 


4242 


6028 


784CIP2B_352


6334 


671 


2457 


4243 


6029


784CIP2B 353 


6337


672 


2458 


4244 


6030 


784CIP2B 354 


6339 


673 


2459 


4245 


6031 


784CIP2B_355 


6346 


674 


2460 


4246 


6032 


784CIP2B_356 


6348 


675 


2461 


4247


6033 


784CIP2B_357


6348 


676


2462


4248


6034 


784CIP2B_358 


6350 


677 


2463 


4249 


6035


784CIP2B_359


6351 


678 


2464 


4250 


6036 


784CIP2B 360 


6355 


679 


2465


4251 


6037 


784CIP2B_361


6362


680


2466 


4252 


6038 


784CIP2B 362 


6368 


681 


2467 


4253 


6039 


784CIP2B 363 


6369 


682 


2468 


4254 


6040 


784CIP2B_364


6371 


683 


2469 


4255 


6041 


784CIP2B_365


6376 


684 


2470 


4256


6042 


784CIP2B_366 


6379 


685 


2471 


4257 


6043 


784CIP2B 367 


6380 


686 


2472 


4258


6044 


784CIP2B_368


6381 


687 


2473 


4259 


6045 


784CIP2B 369 


6392 


688 


2474 


4260 


6046 


784CIP2B_370


6395 


689 


2475 


4261 


6047 


784CIP2B 371 


6397


690 


2476 


4262 


6048 


784CIP2B 372 


6400 


691 


2477 


4263 


6049 


784CIP2B 373 


6401 


692 


2478 


4264 


6050 


784CIP2B 374 


6411 


693 


2479 


4265 


6051


784CIP2B 375 


6411 


694 


2480 


4266 


6052 


784CIP2B 376 


6411 


695 


2481 


4267 


6053 


784CIP2B 377 


6416 


696 


2482 


4268 


6054 


784CIP2B 378 


6416 


697 


2483 


4269 


6055 


784CIP2B 379 


6422 


698 


2484 


4270 


6056


784CIP2B 380 


6423 


699 


2485 


4271 


6057 


784CIP2B 381 


6426 


700 


2486 


4272 


6058 


784CIP2B_382


6427


701 


2487 


4273 


6059 


784CIP2B_383




702


2488


4274


6060 


784CIP2B_384


6429


703 


2489 


4275 


6061 


784CIP2B 385 


6430 


704 


2490 


4276 


6062 


784CIP2B_386


6432 


705 


2491 


4277 


6063 


784CIP2B_387




706 


2492 


4278 


6064 


784CIP2B 388 


fid 1 R 


707 


2493 


4279 


6065 


784CIP2B_389




708 


2494 


4280 


6066 


784CIP2B_390


6446 


709 


2495 


4281 


6067 


784CIP2B_391




710 


2496 


4282 


6068 


784CIP2B 39? 


£ 4 e q 


711 


2497 


4283 


6069 


784CIP2B_394


6461


712


2498 


4284


6070 


784CIP2B_395




713 


2499 


4285 


6071 


784CIP2B 396 


6468 


714 


2500 


4286 


6072 


784CIP2B 397 


6487


715 


2501 


4287 


6073 


784CIP2B 398 


6491


716 


2502 


4288 


6074 


784CIP2B 399 


6S0£ 


717 


2503 


4289 


6075 


784CIP2B 401 


6514 


718 


2504 


4290 


6076 


784CIP2B 402 


6519 


719 


2505 


4291 


6077 


784CIP2B 403 


6521 


720 


2506 


4292 


6078 


784CIP2B_404


6532 


721 


2507 


4293 


6079 


784CIP2B_405 


653£ 


722 


2508 


4294 


6080 


784CIP2B 406 


6543 


723 


2509 


4295 


6081 


784CIP2B_407


6544 


724 


2510 


4296 


6082 


784CIP2B_408 


6548


725 


2511 


4297


6083 


784CIP2B 409 


6551 


726 


2512 


4298 


6084 


784CIP2B 410 


6551 


727 


2513 


4299 


6085 


784CIP2B 411 


6552 


728 


2514 


4300 


6086 


784CIP2B 412 


6554 


729 


2515 


4301 


6087 


784CIP2B 413 


6556 


730 


2516 


4302 


6088 


784CIP2B_414


6560 


731 


2517 


4303 


6089 


784CIP2B 415 


6563 


732 


2518 


4304 


6090 


784CIP2B 416 


6564 


733 


2519 


4305 


6091 


784CIP2B 417 


6567


734 


2520 


4306 


6092


784CIP2B_418 


6573 


735 


2521 


4307 


6093 


784CIP2B_419 


6575 


736 


2522 


4308 


6094 


784CIP2B 420 


6577 


737 


2523 


4309 


6095 


784CIP2B 421 


6593 


738 


2524 


4310 


6096 


784CIP2B_422 


6595 


739 


2525 


4311 


6097 


784CIP2B 423 


6599 


740 


2526 


4312 


6098 


784CIP2B_424


6625 


741 


2527 


4313 


6099 


784CIP2B_425


6625 


742 


2528 


4314 


6100 


784CIP2B 426 


6626 


743 


2529 


4315 


6101


784CIP2B_427 


6630 


744 


2530 


4316 


6102 


784CIP2B_428 


6631 


745 


2531 


4317 


6103 


784CIP2B_429 


6632


746 


2532 


4318


6104 


784CIP2B 430 


6633 


747 


2533 


4319 


6105 


784CIP2B 431 


6634 


748 


2534 


4320 


6106 


784CIP2B 432 


6638 


749 


2535 


4321 


6107 


784CIP2B 433 


6641 


750 


2536 


4322 


6108 


784CIP2B 434 


6644 


751 


2537 


4323 


6109 


784CIP2B 435 


6646


752 


2538 


4324 


6110 


784CIP2B 436 


6648


753 


2539 


4325 


6111 


784CIP2B 437 


6652 


754 


2540 


4326 


6112 


784CIP2B 438 


6654 


755 


2541 


4327 


6113 


784CIP2B_439




756 


2542 


4328 


6114 


784CIP2B 440 


6658 


757 


2543 


4329 


6115 


784CIP2B_441


DO D O 


758 


2544 


4330 


6116 


784CIP2B 442 


6664 


759


2545 


4331 


6117 


784CIP2B 443 


6668


760 


2546 


4332 


6118 


784CIP2B_444


o O O 7 


761 


2547 


4333 


6119 


784CIP2B_445


6673 


762 


2548 


4334 


6120 


784CIP2B_446


OOoD 


763 


2549


4335


6121


784CIP2B_447


OO 0 / 


764 


2550 


4336 


6122 


784CIP2B_448


A £ o o 
OD07 


765 


2551 


4337 


6123 


784CIP2B_449


6693


766


2552 


4338 


6124




6698


767 


2553 


4339 


6125 


784CIP2B_451


6699


768 


2554


4340 


6126 




cine 


769 


2555


4341


6127




o /ll 


770 


2556 


4342 


6128




6713 


771 


2557 


4343 


6129




© / Id 


772 


2558 


4344 


6130 


784CIP2B_456


677*; " 


773


2559 


4345 


6131 


784CIP2B_457


6726


774 


2560 


4346 


6132 


784CIP2B_458


G70 7 


775 


2561 


4347 


6133 


784CIP2B_459


ft7in 


776 


2562 


4348 


6134 




O / JU 


777 


2563 


4349 


6135 


784CIP2B_461


D f JU 


778 


2564 


4350 


6136 


784CIP2B 462 


O / J 4 


779 


2565 


4351 


6137 


784CIP2B 463 


6733 


780 


2566 


4352 


6138 


784CIP2B 464 


6737


781 


2567 


4353 


6139 


784CIP2B 465 


6745 


782 


2568 


4354 


6140 


784CIP2B 466 


6751 


783 


2569 


4355 


6141


784CIP2B 467 


6754 


784 


2570 


4356 


6142 


784CIP2B 468 


6758 


785 


2571 


4357 


6143 


784CIP2B 469 


6761 


786 


2572 


4358 


6144 


784CIP2B 470 


6765 


787 


2573 


4359 


6145 


784CIP2B 471 


6768 


788


2574 


4360 


6146 


784CIP2B_472


6773 


789 


2575 


4361 


6147 


784CIP2B 473 


6776 


790 


2576 


4362 


6148 


784CIP2B_474 


6796 


791 


2577 


4363 


6149 


784CIP2B 475 


6798 


792 


2578


4364 


6150 


784CIP2B 476 


6823


793 


2579 


4365 


6151 


784CIP2B 477 


6825


794 


2580 


4366 


6152 


784CIP2B 478 


6826 


795 


2581 


4367 


6153 


784CIP2B_479


6839 


796 


2582 


4368 


6154 


784CIP2B 480 


6844 


797 


2583 


4369 


6155 


784CIP2B 482 


6849 


798 


2584


4370 


6156 


784CIP2B_483


6854 


799 


2585 


4371 


6157 


784CIP2B_484


6857 


800 


2586 


4372 


6158 


784CIP2B_485


6861 


801 


2587


4373 


6159 


784CIP2B 486 


6873 


802 


2588 


4374 


6160 


784CIP2B_487


6875 


803 


2589 


4375 


6161 


784CIP2B 488 


6877


SEQ ID NO: of full-length nucleotide sequence


SEQ ID NO: of full-length peptide sequence


SEQ ID NO: of contig nucleotide sequence


SEQ ID NO: of contig peptide sequence


Priority docket number_corresponding SEQ ID NO: in priority application


SEQ ID NO: in U.S.S.N. 09/488,725


804 


2590


4376 


6162 


784CIP2B 489 


6880 


805 


2591 


4377 


6163 


784CIP2B 490 


6885 


806 


2592 


4378 


6164 


784CIP2B 491 


6890


807 


2593 


4379 


6165 


784CIP2B 492 


6890 


808 


2594 


4380 


6166 


784CIP2B 493 


6894 


809


2595 


4381 


6167 


784CIP2B 494 


6901 


810 


2596 


4382 


6168 


784CIP2B_495


6904 


811 


2597 


4383 


6169 


784CIP2B 496 


6907 


812


2598


4384 


6170 


784CIP2B 497 


6914 


813 


2599 


4385 


6171 


784CIP2B_498


col n 

u31 / 


814


2600 


4386 


6172 


784CIP2B_499




815 


2601 


4387 


6173 


784CIP2B_500




816 


2602 


4388 


6174 


784CIP2B 501 


6931 


817


2603 


4389 


6175 


784CIP2B_502




818 


2604 


4390 


6176 


784CIP2B_503


6940


819 


2605 


4391 


6177


784CIP2B_504




820 


2606 


4392


6178


784CIP2B_505


6946 


821 


2607


4393


6179


784CIP2B_506


6947 


822 


2608 


4394 


6180




6949


823 


2609 


4395 


6181 




6959 


824 


2610 


4396 


6182


784CIP2B_509


6960 


825 


2611 


4397


6183


784CIP2B_510


6962 


826 


2612 


4398 


6184 




6963


827


2613 


4399 


6185 




6967 


828 


2614 


4400 


6186 




6983


829 


2615 


4401 


6187 


784CIP2B_514


6988 


830


2616 


4402 


6188


784CIP2B_515


6996 


831 


2617 


4403 


6189 




7003 


832 


2618 


4404 


6190 


784CIP2B_517


7016 


833 


2619 


4405 


6191 




7017 


834 


2620 


4406 


6192 


784CIP2B_519


7025 


835 


2621 


4407 


6193 


784CIP2B_520


7025 


836 


2622 


4408 


6194 




7025 


837 


2623 


4409 


6195


784CIP2B_522


7050 


838 


2624 


4410 


6196 


784CIP2B_523


7051 


839 


2625 


4411 


6197 


784CIP2B_524


7055 


840 


2626 


4412 


6198 




7060 


841 


2627 


4413 


6199 


784CIP2B_526


7064 


842 


2628 


4414 


6200 


784CIP2B_527


7067 


843 


2629 


4415 


6201 


784CIP2B 528 


/ U / J. 


844 


2630 


4416 


6202 


784CIP2B 529 


7072 


845 


2631 


4417 


6203 


784CIP2B_530


7073 


846 


2632 


4418 


6204 


784CIP2B 531 


707£ 


847 


2633 


4419 


6205 


784CIP2B 532 


7088 


848 


2634 


4420 


6206 


784CIP2B 533 


7089


849 


2635 


4421 


6207 


784CIP2B 534 


7091


850 


2636 


4422 


6208 


784CIP2B 535 


7091 


851 


2637 


4423 


6209 


784CIP2B 536 


7104 


852 


2638


4424 


6210 


784CIP2B 537 


7105 


853 


2639


4425 


6211 


784CIP2B_538


/XU3 


854 


2640 


4426


6212


784CIP2B_539


/ ±un 


855 


2641 


4427 


6213


784CIP2B_540




856 


2642


4428 


6214 


784CIP2B 541 


7119 


857 


2643


4429 


6215 


784CIP2B 542 


7120 


858 


2644 


4430 


6216 


784CIP2B_543


7121 


859


2645 


4431 


6217 


784CIP2B_544


7126 


860 


2646 


4432 


6218 


784CIP2B 545 


7127 


861 


2647 


4433 


6219 


784CIP2B 546 


7130 


862 


2648 


4434 


6220 


784CIP2B 547 


7131 


863 


2649 


4435 


6221


784CIP2B 548 


7144 


864 


2650 


4436 


6222 


784CIP2B 549 


7159 


865 


2651 


4437 


6223 


784CIP2B 550 


7163


SEQ ID NO: of full-length nucleotide sequence


SEQ ID NO: of full-length peptide sequence


SEQ ID NO: of contig nucleotide sequence


SEQ ID NO: of contig peptide sequence


Priority docket number_corresponding SEQ ID NO: in priority application


SEQ ID NO: in U.S.S.N. 09/488,725


866


2652 


4438


6224 


784CIP2B_551 


7175 


867 


2653 


4439 


6225 


784CIP2B 552 


7188 


868


2654


4440


6226


784CIP2B_553 


7189 


869


2655 


4441 


6227 


784CIP2B_554


7190 


870 


2656 


4442 


6228 


784CIP2B_555 


7191 


871


2657 


4443 


6229 


784CIP2B 556 


7203 


872 


2658 


4444 


6230 


784CIP2B 557 


7204 


873 


2659 


4445 


6231 


784CIP2B_558 


7208 


874 


2660 


4446 


6232 


784CIP2B 559 


7209 


875 


2661 


4447 


6233 


784CIP2B 560 


7210 


876 


2662 


4448 


6234 


784CIP2B_561


7216 


877 


2663 


4449 


6235 


784CIP2B_562 


7221 


878 


2664 


4450 


6236 


784CIP2B_563 


7230 


879 


2665 


4451 


6237 


784CIP2B_564


7237 


880 


2666 


4452 


6238 


784CIP2B 565 


7240 


881 


2667 


4453 


6239 


784CIP2B_566 


7245 


882 


2668 


4454 


6240 


784CIP2B_567 


7250 


883 


2669 


4455 


6241 


784CIP2B 568 


7251 


884 


2670 


4456 


6242 


784CIP2B_569 


7255 


885


2671


4457 


6243 


784CIP2B_570 


7260 


886


2672 


4458 


6244 


784CIP2B 571 


7265 


887 


2673 


4459 


6245 


784CIP2B_572 


7268 


888


2674 


4460 


6246 


784CIP2B 573 


7275 


889 


2675 


4461 


6247 


784CIP2B_574 


7279 


890 


2676 


4462 


6248 


784CIP2B_575


7283 


891 


2677 


4463 


6249 


784CIP2B_576


7283 


892 


2678 


4464 


6250 


784CIP2B_577


7287 


893 


2679 


4465 


6251 


784CIP2B 578 


7301


894 


2680 


4466 


6252 


784CIP2B_579 


7308


895


2681 


4467 


6253 


784CIP2B_580


7308 


896 


2682


4468 


6254 


784CIP2B 581 


7309 


897 


2683 


4469 


6255 


784CIP2B_582 


7319 


898 


2684 


4470 


6256 


784CIP2B 583 


7320 


899 


2685 


4471 


6257 


784CIP2B_584 


7326 


900 


2686 


4472 


6258 


784CIP2B 585 


7326 


901 


2687 


4473 


6259 


784CIP2B 586 


7334


902 


2688 


4474 


6260 


784CIP2B 587 


7337


903


2689 


4475 


6261 


784CIP2B_588 


7339 


904 


2690 


4476 


6262 


784CIP2B_589 


7344 


905 


2691 


4477 


6263


784CIP2B_590


7355


906


2692


4478 


6264 


784CIP2B_591 


7363 


907 


2693 


4479 


6265 


784CIP2B 592 


7363 


908 


2694


4480 


6266 


784CIP2B_593 


7365


909


2695


4481 


6267 


784CIP2B 594 


7368 


910 


2696


4482


6268


784CIP2B_595


7369 


911 


2697


4483


6269


784CIP2B_596


7372 


912 


2698 


4484


6270


784CIP2B_599


7375 


913 


2699


4485 


6271 


784CIP2B_600


7381 


914


2700


4486


6272 


784CIP2B 601 


7383 


915 


2701 


4487


6273 


784CIP2B 602 


7387 


916


2702


4488


6274 


784CIP2B 603 


7391 


917


2703


4489


6275 


784CIP2B_604 


7393 


918 


2704 


4490 


6276 






919 


2705 


4491 


6277 


784CIP2B_606


7397 


920 


2706 


4492 


6278 


784CIP2B_607 


7399 


921 


2707 


4493 


6279 


784CIP2B_608 


7405


922 


2708 


4494 


6280 


784CIP2B 609 


7406


923 


2709 


4495 


6281 


784CIP2B 610 


7406 


924 


2710 


4496 


6282 


784CIP2B 611 


7409 


925 


2711 


4497 


6283 


784CIP2B 612 


7410 


926 


2712 


4498 


6284 


784CIP2B_613


7411 


927 


2713


4499 


6285 


784CIP2B 614 


7417


SEQ ID NO: of full-length nucleotide sequence


SEQ ID NO: of full-length peptide sequence


SEQ ID NO: of contig nucleotide sequence


SEQ ID NO: of contig peptide sequence


Priority docket number_corresponding SEQ ID NO: in priority application


SEQ ID NO: in U.S.S.N. 09/488,725


928 


2714 


4500 


6286 


784CIP2B 615 


7418 


929 


2715


4501 


6287 


784CIP2B_616


7421 


930 


2716 


4502 


6288 


784CIP2B 617 


7422 


931 


2717 


4503 


6289 


784CIP2B_618


7422 


932 


2718 


4504 


6290 


784CIP2B 619 


7423


933


2719 


4505 


6291 


784CIP2B_620


7 A O A 


934 


2720 


4506 


6292 


784CIP2B_621


7426 


935 


2721 


4507 


6293 


784CIP2B_622


7427


936 


2722 


4508 


6294 


784CIP2B_623


7428


937 


2723 


4509 


6295 




7430


938 


2724 


4510 


6296 




7435


939 


2725 


4511 


6297 






940 


2726 


4512 


6298


784CIP2B_627


7439


941 


2727 


4513 


6299 


784CIP2B_628


7440


942 


2728 


4514 


6300


784CIP2B_629


7442


943 


2729 


4515 


6301


784CIP2B_630


i a c n ^ 


944 


2730 


4516 


6302




7451


945 


2731 


4517 


6303


784CIP2B_632


7452 


946 


2732 


4518 


6304


784CIP2B_633


7454 


947 


2733 


4519


6305


784CIP2B_634


7457


948 


2734 


4520


6306


784CIP2B_635


7459


949 


2735 


4521 


6307




7461 


950 


2736


4522


6308


784CIP2B_637


7463 


951 


2737


4523


6309


784CIP2B_638


7466 


952 


2738 


4524 


6310




7469 


953 


2739 


4525


6311


784CIP2B_640


7473


954 


2740 


4526 


6312


784CIP2B_641


7481


955 


2741 


4527 


6313


784CIP2B_642


7482 


956 


2742


4528


6314


784CIP2B_643


7482


957 


2743 


4529 


6315 


784CIP2B_644


7483 


958 


2744 


4530 


6316 


784CIP2B_645


7485 


959 


2745 


4531 


6317 


784CIP2B_646


7486


960


2746 


4532 


6318


784CIP2B_647


7487 


961 


2747 


4533 


6319


784CIP2B_648


7491 


962 


2748 


4534 


6320


784CIP2B_649


7492


963


2749 


4535


6321


784CIP2B_650


7494 


964 


2750


4536 


6322 




7498 


965 


2751 


4537 


6323 




7504 


966 


2752 


4538 


6324 




7508 


967 


2753 


4539 


6325




TCI c 


968


2754


4540 


6326 


784CIP2B_655


7518 


969 


2755 


4541 


6327 


784CIP2B_656


7519 


970 


2756 


4542 


6328 


784CIP2B 657 


7521 


971 


2757 


4543 


6329 


784CIP2B 658 


7529 


972 


2758 


4544 


6330 


784CIP2B 659 


7532 


973 


2759 


4545 


6331 


784CIP2B_660


7533


974 


2760 


4546 


6332 


784CIP2B 661 


7535


975 


2761 


4547 


6333 


784CIP2B 662 


7545


976


2762


4548 


6334 


784CIP2B_663


7546


977 


2763


4549


6335


784CIP2B 664 


7552 


978 


2764 


4550 


6336 


784CIP2B_665


/ 33 % 


979 


2765 


4551 


6337 


784CIP2B_666


7567 


980 


2766 


4552


6338


784CIP2B_667


7569 


981 


2767 


4553 


6339 


784CIP2B_668 


7575 


982 


2768 


4554


6340


784CIP2B_669


7575 


983 


2769 


4555 


6341 


784CIP2B_670


7577 


984 


2770 


4556 


6342 


784CIP2B 671 


7579 


985 


2771 


4557


6343


784CIP2B_672


7582 


986 


2772 


4558 


6344 


784CIP2B 673 


7587 


987 


2773 


4559 


6345 


784CIP2B_674


7589 


988 


2774 


4560 


6346 


784CIP2B_675


7597 


989 


2775


4561 


6347 


784CIP2B 676 


7597






SEQ ID NO: of full-length nucleotide sequence


SEQ ID NO: of full-length peptide sequence


SEQ ID NO: of contig nucleotide sequence


SEQ ID NO: of contig peptide sequence


Priority docket number_corresponding SEQ ID NO: in priority application


SEQ ID NO: in U.S.S.N. 09/488,725


990 


2776 


4562 


6348 


784CIP2B 677 


7609 


991 


2777 


4563 


6349


784CIP2B_678


7609 


992 


2778 


4564 


6350 


784CIP2B 679 


7609 


993 


2779


4565 


6351 


784CIP2B 680 


7613 


994


2780 


4566 


6352 


784CIP2B_681


7623 


995


2781


4567


6353


784CIP2B_682


7629 


996 


2782 


4568 


6354 


784CIP2B 683 


7630 


997


2783


4569


6355


784CIP2B 684 


7633 


998 


2784 


4570 


6356 


784CIP2B 685 


7635 


999 


2785 


4571 


6357 


784CIP2B_686


7638


1000 


2786 


4572 


6358 


784CIP2B 687 


7639


1001 


2787 


4573 


6359 


784CIP2B_688


7C4C 


1002 


2788 


4574 


6360 




7647


1003


2789


4575 


6361 


784CIP2B_690


/ 0 1 0 


1004 


2790 


4576 


6362 


784CIP2B_691


7658


1005 


2791 


4577 


6363


784CIP2B_692


7664 


1006 


2792 


4578 


6364 


784CIP2B_693


T Hit A 


1007 


2793 


4579 


6365 




7674 


1008 


2794 


4580 


6366 




7675


1009 


2795 


4581 


6367 




7676 


1010 


2796 


4582 


6368 


7R4PTP')R Cap 


7681 


1011 


2797 


4583 


6369 




7688


1012 


2798 


4584 


6370 




7693


1013 


2799 


4585 


6371 




7694 


1014 


2800 


4586


6372



7715 


1015 


2801 


4587 


6373 




7716 


1016 


2802 


4588 


6374 


784CIP2B_704


7718 


1017 


2803 


4589 


6375 


784CIP2B_705


7721 


1018 


2804 


4590 


6376 


784CIP2B_706


7723 


1019 


2805 


4591 


6377 


784CIP2B_707


7729 


1020 


2806 


4592


6378


784CIP2B_708


7733 


1021 


2807 


4593 


6379 




7735 


1022 


2808 


4594 


6380 


784CIP2B_710


7741 


1023 


2809 


4595 


6381 


784CIP2B_711


7743 


1024 


2810 


4596 


6382 




7748 


1025 


2811 


4597 


6383 




7749 


1026 


2812 


4598 


6384


784CIP2B_714


7750 


1027 


2813 


4599 


6385 


784CIP2B_715


T7CJ 


1028 


2814 


4600 


6386 


784CIP2B_716




1029 


2815 


4601 


6387 


784CIP2B_717




1030


2816 


4602 


6388 


784CIP2B_718


s /ou 


1031 


2817 


4603 


6389 


784CIP2B_719


/ /b^ 


1032 


2818 


4604 


6390 


784CIP2B 720 


7765 


1033


2819 


4605 


6391 


784CIP2B 721 


77CC 


1034 


2820 


4606 


6392 


784CIP2B 722 


7767 


1035 


2821 


4607


6393 


784CIP2B 723 


7769 


1036 


2822 


4608 


6394 


784CIP2B 724 


7770 


1037 


2823 


4609 


6395 


784CIP2B 725 


7774 


1038 


2824


4610 


6396 


784CIP2B 726 


7779 


1039 


2825


4611


6397


784CIP2B 727 


7781 


1040


2826 


4612 


6398 


784CIP2B 728 


7782 


1041 


2827 


4613 


6399 


784CIP2B_729


7701 


1042 


2828 


4614 


6400 


784CIP2B_730


7787 


1043 


2829 


4615


6401


784CIP2B_731 


7792 


1044 


2830 


4616 


6402 


784CIP2B 732 


7795 


1045 


2831 


4617 


6403


784CIP2B 733 


7801 


1046 


2832 


4618 


6404 


784CIP2B_734


7807 


1047 


2833 


4619 


6405 


784CIP2B_735


7808


1048 


2834 


4620 


6406


784CIP2B_736


7819 


1049 


2835 


4621 


6407 


784CIP2B 737 


7824 


1050 


2836 


4622 


6408


784CIP2B_738


7826 


1051 


2837 


4623 


6409


784CIP2B_739


7829


SEQ ID NO:
of full- 
length 
nucleotide 
sequence 



1052 



1053 



1054 
1055 



SEQ ID

NO: of 
full- 
length 
peptide
sequence 



2838 



2839



2840 
2841 



SEQ ID NO: 
of contig 
nucleotide 
sequence 



4624 



4625 



4626 



SEQ ID 
NO: 

of contig 

peptide 

sequence 



6410



6411 



Priority 
docket number_ 
corresponding 
SEQ ID NO: in 
priority 
application 



SEQ ID 
NO: in 
U.S. S.N. 
09/488,725 



784CIP2B_740



784CIP2B 741 



784CIP2B 743 



7832 



7833 



7847 



1056 



1057 



2842



2843 



4627 



6413 



4628 



6414 



784CIP2B_744
784CIP2B_745



4629 



6415 



784CIP2B 746 



7848 



7853 



7854 



1058 



1059



1060 
1061 



2844 



2845 



2846
2847



4630 



6416 



784CIP2B 747 



4631 



6417 



784CIP2B 748 



4632 
4633 



6418 



784CIP2B 749 



7856 
7862 



7865 



1062 



1063 



1064 



1065 



1066 



1067 
1068 



1069 



1070 



1071 



2848 



2849 



2850 



2851 



2852 



2853 



2854
2855



2856 



2857 



6419 



784CIP2B 750 



4634 



6420 



784CIP2B 751 



4635 



6421 



4636 



6422 



784CIP2B 752 



784CIP2B 753 



4637 



6423 



784CIP2B 754 



4638



6424 



784CIP2B 755 



4639 



6425 



784CIP2B 756 



4640 



4641



6426 
6427 



784CIP2B 757 



784CIP2B 758 



4642 



6428 



4643 



784CIP2B 759 



6429 



784CIP2B 760 



7874 



7877 



7880 



7882 



7884 



7886 



7888 



7889 



7901 



7910 



7911 



1072 



1073 
1074 



1075 



1076 



1077 



1078 



1079 



1080 



1081 



1082 



1083 



1084



1085 



1086 
1087 



2858 



2859 



2860 
2861 



2862 
2863 



2864 



2865 



2866
2867



2868 



2869 



2870 
2871 



2872



2873 



4644 
4645



6430 



784CIP2B_761



4646



784CIP2B 762 



6432 



4647 



784CIP2B 763 



6433 



4648
4649



784CIP2B 764 



6434 
6435 



784CIP2B_765



784CIP2B_766



4650 



6436 



784CIP2B 767 



4651 



6437 



4652 
4653 



784CIP2B 768 



4654 



4655 



4656 
4657 



4658 



4659 



6438 



784CIP2B 769 



6439 



784CIP2B 770 



6440



784CIP2B 771 



6441 



6442 



6443 



6444 



6445 



784CIP2B 772 



784CIP2B 773 



784CIP2B 774 



784CIP2B 775 



784CIP2B 776 



7921 



7923 



7924 



7925 



7928 



7929 



7930 



7934 



7938 



7942 



7945



7946 



7948 



7951 



7952
7953 



1088 



1089 



1090 



1091 
1092 



1093 
1094 



1095 



1096
1097 



1098 



1099 



1100



1101 
1102 



1103



1104



2874 



2875 



2876 



2877 
2878 



2879 



2880 



2881 



2882 



2883 



2884



2885



2886



2887 



2888 



2889 



4660 



6446 



4661 



784CIP2B 777 



6447 



4662
4663



6448 
6449 



784CIP2B_778
784CIP2B_779



784CIP2B 780 



4664 



4665 



4666 



4667 



4668 



4669 



4670 



4671 



4672 



4673 



4674 



4675 



6450
6451 



784CIP2B_781



784CIP2B 782 



6452 



6453 



6454 



6455 



6456 



784CIP2B 783 



784CIP2B 784 



784CIP2B 785 



784CIP2B 786 



6457 



6458 



6459 



6460 



6461 



784CIP2B 787 



784CIP2B 788 



784CIP2B 



784CIP2B 



789 
790 



784CIP2B 



784CIP2B 



791 



7954 



7957 



7958 



7961 



7965 



7966 



7979 



7986 



7986 



7988 



7991 



7992 



7992 



7992 



7992 



8003 



1105 



1106 



1107 



1108 
1109 



1110 



1111 
1112 



2890 



2891



2892 



2893



2894



2895 



2896 



2897 



4676 



6462 



4677 



784CIP2B 793 



6463 



4678 
4679 



6464 



784CIP2B 794 



6465



784CIP2B_795



4680 



6466 



4681 



6467 



4682 
4683 



6468
6469 



784CIP2B 796 



784CIP2B 797 



784CIP2B_798



784CIP2B 



784CIP2B



799
800



8014 



8015 



8016 



8017 



8019 



8020 



8022 



8022 



2898
2899 



4684 
4685 



6470 



784CIP2B_801



8028 



6471 



784CIP2B_802



8030


SEQ ID NO: of full-length nucleotide sequence


SEQ ID NO: of full-length peptide sequence


SEQ ID NO: of contig nucleotide sequence


SEQ ID NO: of contig peptide sequence


Priority docket number_corresponding SEQ ID NO: in priority application


SEQ ID NO: in U.S.S.N. 09/488,725


1114 


2900 


4686 


6472 


784CIP2B 803 


8038 


1115


2901 


4687 


6473 


784CIP2B 804 


O f ** j£ 


1116 


2902 


4688 


6474 


784CIP2B 805 


8045 


1117 


2903 


4689 


6475


784CIP2B 806 


O v *iO 


1118 


2904 


4690 


6476 


784CIP2B 807 


8046 


1119 


2905 


4691 


6477 


784CIP2B 808 


8047 


1120 


2906 


4692 


6478 


784CIP2B 809 


8051 


1121 


2907 


4693 


6479 


784CIP2B 810 


8059 


1122 


2908 


4694 


6480 


784CIP2B 811 




1123 


2909 


4695 


6481 


784CIP2B 812 


8069


1124 


2910 


4696 


6482


784CIP2B_813


o n "7 A 
o u / 1 


1125 


2911 


4697 


6483 


784CIP2B_814


8077 


1126 


2912 


4698 


6484 


784CIP2B_815


8078 


1127 


2913 


4699 


6485 


784CIP2B_816


8079 


1128 


2914 


4700 


6486 




8084


1129 


2915 


4701 


6487 


784CIP2B_818


8088


1130 


2916 


4702 


6488 


784CIP2B_819


8090 


1131 


2917 


4703 


6489 


784CIP2B_820


8091 


1132 


2918 


4704


6490 




8099 


1133 


2919 


4705 


6491 




8099 


1134 


2920 


4706 


6492 




8100 


1135 


2921 


4707 


6493 




8102 


1136 


2922 


4708 


6494 


784CIP2B_825


8103 


1137 


2923 


4709 


6495 




8103 


1138 


2924 


4710


6496


784CIP2B_827


8104 


1139 


2925


4711


6497


784CIP2B_828


8108 


1140 


2926 


4712 


6498 


784CIP2B_829


8110 


1141 


2927 


4713


6499


784CIP2B_830


8116 


1142 


2928 


4714


6500


784CIP2B_831


8117 


1143 


2929 


4715


6501


784CIP2B_832


8123 


1144 


2930 


4716 


6502


784CIP2B_833


8130 


1145 


2931 


4717


6503


784CIP2B_834


8130 


1146 


2932 


4718 


6504


784CIP2B_835


8143 


1147


2933


4719


6505




8143


1148


2934


4720 


6506 


784CIP2B_837


8154 


1149 


2935 


4721 


6507 


784CIP2B_838


8155 


1150 


2936 


4722 


6508 


784CIP2B_839


8162 


1151 


2937


4723


6509


784CIP2B_840


ftn d-i 


1152 


2938 


4724 


6510 


784CIP2B 841 


Q1 *7""J 
OX / «£ 


1153 


2939 


4725


6511 


784CIP2B_842 


8173 


1154


2940


4726 


6512 


784CIP2B 843 


8179 


1155 


2941 


4727 


6513 


784CIP2B_844


8182 


1156 


2942 


4728 


6514 


784CIP2B 845 


8183 


1157 


2943 


4729 


6515 


784CIP2B 846 


8184 


1158 


2944 


4730 


6516 


784CIP2B_847


8185 


1159 


2945 


4731 


6517 


784CIP2B 848 


8187 


1160


2946


4732 


6518 


784CIP2B_849 


8188 


1161


2947 


4733 


6519 


784CIP2B 850 


8190 


1162 
1163 


2948
2949 


4734 
4735 


6520 
6521 


784CIP2B 851 




1164 


2950 


4736 


6522 


784CIP2B 852 
784CIP2B 853 


8193 


1165 
1166 


2951
2952 


4737 
4738 


6523 
6524 


784CIP2B 854 
784CIP2B_855 


8197 
8197


1167 
1168 

1169 


2953 
2954 
2955 


4739 
4740
4741 


6525 
6526 
6527 


784CIP2B 856 
784CIP2B 857 


8199 
8202 


1170 


2956 


4742 


6528 


784CIP2B 858 
784CIP2B_859 


8203 
8208 


1171 


2957 


4743 


6529


784CIP2B 860 


8209 


1172 


2958 


4744 


6530 


784CIP2B 861 


8211 


1173
1174

1175


2959
2960
2961 


4745 
4746 
4747 


6531 
6532 
6533 


784CIP2B_862
784CIP2B_863
784CIP2B_864


8214
8217
8223


SEQ ID NO: of full-length nucleotide sequence


SEQ ID NO: of full-length peptide sequence


SEQ ID NO: of contig nucleotide sequence


SEQ ID NO: of contig peptide sequence


Priority docket number_corresponding SEQ ID NO: in priority application


SEQ ID NO: in U.S.S.N. 09/488,725


1176


2962


4748


6534


784CIP2B 865 


8224 


1177 


2963


4749


6535


784CIP2B_866


8226


1178


2964


4750


6536


784CIP2B_867


8227


1179


2965


4751 


6537 


784CIP2B 868 


8229 


1180


2966


4752


6538 


784CIP2B 869 


8232 


1181


2967 


4753 


6539 


784CIP2B 870 


8236


1182


2968


4754


6540 


784CIP2B 871 


8239 


1183


2969


4755


6541 


784CIP2B 872 


8244 


1184 


2970


4756 


6542 


784CIP2B 873 


8245 


1185 


2971


4757 


6543 


784CIP2B 874 


8248 


1186 


2972 


4758 


6544 


784CIP2B_875 


8251 


1187 


2973 


4759 


6545 


784CIP2B_876 


8253 


1188 


2974 


4760 


6546 


784CIP2B 877 


8260 


1189 


2975 


4761 


6547 


784CIP2B_878 


8262 


1190 


2976 


4762 


6548 


784CIP2B_879


8268


1191 


2977 


4763 


6549 


784CIP2B_880


8270 


1192 


2978 


4764 


6550 


784CIP2B_881 


8272 


1193 


2979 


4765 


6551 


784CIP2B 882 


8274 


1194 


2980 


4766 


6552 


784CIP2B 883 


8274 


1195 


2981 


4767 


6553 


784CIP2B 884 


8275 


1196 


2982 


4768 


6554 


784CIP2B_885 


8277 


1197 


2983 


4769 


6555 


784CIP2B 886 


8281 


1198 


2984 


4770 


6556 


784CIP2B_887 


8283 


1199 


2985 


4771 


6557 


784CIP2B_888 


8289 


1200 


2986 


4772 


6558 


784CIP2B 889 


8295 


1201 


2987 


4773 


6559 


784CIP2B 890 


8300 


1202


2988 


4774 


6560 


784CIP2B 891 


8303 


1203 


2989 


4775 


6561 


784CIP2B_892 


8304 


1204 


2990 


4776 


6562 


784CIP2B 893 


8305 


1205 


2991 


4777 


6563 


784CIP2B_894 


8309 


1206 


2992 


4778


6564 


784CIP2B_895 


8318 


1207


2993 


4779 


6565 


784CIP2B_896 


8319 


1208


2994 


4780 


6566 


784CIP2B_897 


8321 


1209


2995 


4781 


6567 


784CIP2B_898 


8322 


1210 


2996 


4782 


6568 


784CIP2B_899


8323 


1211 


2997 


4783 


6569 


784CIP2B 900 


8325 


1212 


2998 


4784 


6570 


784CIP2B_901 


8331 


1213


2999 


4785 


6571 


784CIP2B_902 


8332 


1214


3000 


4786 


6572 


784CIP2B_903 


8333


1215


3001


4787 


6573 


784CIP2B_904


8335


1216 


3002


4788 


6574 


784CIP2B_905


8336 


1217 


3003 


4789 


6575 


784CIP2B 906 


8337 


1218 


3004 


4790


6576 


784CIP2B_907 


8340 


1219 


3005


4791 


6577 


784CIP2B 908 


8343 


1220


3006 


4792 


6578 


784CIP2B 909 


8347


1221 


3007


4793


6579


784CIP2B 910 


8349 


1222 


3008 


4794 


6580 


784CIP2B_911 


8351 


1223 


3009


4795 


6581 


784CIP2B_912 


8353 


1224 


3010


4796


6582


784CIP2B_913


8355


1225 


3011 


4797


6583


784CIP2B_914


8361 


1226


3012


4798


6584 


784CIP2B_915 


8365 


1227 


3013 


4799


6585 


784CIP2B 916 


8367 


1228 


3014 


4800 


6586


784CIP2B_917


O .3 b J 


1229 


3015 


4801 


6587 


784CIP2B_919 


8375 


1230 


3016 


4802 


6588 


784CIP2B_920 


8387 


1231 


3017 


4803 


6589 


784CIP2B_921


8391 


1232


3018 


4804 


6590 


784CIP2B 922 


8393


1233 


3019 


4805


6591 


784CIP2B_923 


8393 


1234 


3020 


4806 


6592 


784CIP2B 924 


8394 


1235 


3021 


4807 


6593 


784CIP2B_925


8395


1236 


3022 


4808 


6594 


784CIP2B 926 


83 9* 


1237 


3023


4809 


6595 


784CIP2B_927 


8398


SEQ ID NO: of full-length nucleotide sequence


SEQ ID NO: of full-length peptide sequence


SEQ ID NO: of contig nucleotide sequence


SEQ ID NO: of contig peptide sequence


Priority docket number_corresponding SEQ ID NO: in priority application


SEQ ID NO: in U.S.S.N. 09/488,725


1238 


3024 


4810 


6596 


784CIP2B 928 


8402 


1239 


3025


4811


6597


784CIP2B_929


oa no 

OH U4 


1240 


3026 


4812 


6598 




8405


1241 


3027 


4813 


6599 


784CIP2B_931


0*1 u 0 


1242 


3028 


4814


6600 




8409


1243 


3029 


4815 


6601 


784CIP2B_933


8410 


1244 


3030 


4816 


6602 




8414 


1245 


3031 


4817 


6603 


784CIP2B_935


8415


1246 


3032 


4818 


6604 




8419 


1247 


3033 


4819 


6605 


784CIP2B_937


8426


1248 


3034 


4820 


6606




8430 


1249 


3035 


4821 


6607 




8431 


1250 


3036 


4822


6608


784CIP2B_940


8432 


i 1251 


3037 


4823


6609


784CIP2B_941


8433 


1252 


3038 


4824


6610


784CIP2B_942


8434 


1253 


3039 


4825


6611


784CIP2B_943


8438 


1254 


3040 


4826


6612


784CIP2B_944


8439 


1255 


3041 


4827 


6613 


784CIP2B_945


8441 


1256 


3042 


4828


6614


784CIP2B_946


8450


1257


3043 


4829 


6615


784CIP2B_947


8451 


1258 


3044 


4830


6616


784CIP2B_948


8452 


1259 


3045 


4831


6617


784CIP2B 949 


8460 


1260 


3046 


4832


6618


784CIP2B 950 


8461 


1261 


3047


4833


6619


784CIP2B_951


8462 


1262 


3048


4834


6620


784CIP2B 952 


8464


1263 


3049


4835


6621


784CIP2B 953 


8465 


1264 


3050 


4836


6622


784CIP2B_954


8467 


1265 


3051 


4837




784CIP2B 955 


8470 


1266 


3052 


4838


6624


784CIP2B 956 


8471 


1267 


3053 


4839


6625


784CIP2B_957


8473 


1268 


3054 


4840 




784CIP2B 958


8474 


1269 


3055 


4841


6627


784CIP2B_959 


8475 


1270 


3056 


4842 


6628


784CIP2B 960 


8476 


1271 


3057 


4843


6629


784CIP2B 961 


8480 


1272 


3058 


4844 


6630


784CIP2B_962


8482 


1273 


3059 


4845


6631


Toy* riT T"i*1tt n 0 


8482


1274 


3060 


4846


6632


784CIP2B 964 


8486 


1275


3061 


4847 


6633 


/a4l~X4r^:B ybb 


8488 


1276 


3062 


4848 


6634 


"7 Q A f"T DOT! £ 


8492 


1277 


3063 


4849 


6635 


/o*tL.XJr*B 7b / 


8494 


1278 


3064 


4850 


6636


' 0 ** L-X k'jZti Hbo 


8496 


1279 


3065


4851 


6637 


/ OHLifiSl} 7b 7 


8497 


1280


3066 


4852


6638 


' Ofl l„JL tT4ZD «? / U 


8499 


1281 


3067 


4853


6639 


1 R/iCT'OO'D. Q*71 1 


8513 


1282 


3068 


4854 


6640 


7fl4C?rP7R Q79 

/ O ■» s» JL r<to 37 / a 




1283


3069


4855 


6641 




8526 


1284 


3070 


4856 


6642 


784CIP2B 974


S3 JX 


1285 


3071 


4857 


6643 


7S4CTPPR 07c 


8533 


1286 


3072 


4858 


6644 


"7fl APT DOt3 Q7C 
/ 0 *i X f £i j / 0 


8542


1287 


3073 


4859 


6645 




8544


1288 


" 3074 


4860


6646 


/oiLifiD j to 


8565 


1289 


3075 


4861 


6647 


T9APTDTQ Q<ift 


8565


1290 


3076 


4862 


6648 


784CIP2B 980 


8572 


1291


3077 


" 4 863 


6649 


784CIP2B 981 


8576


1292 


3078 


4864 


6650 


784CIP2B 982 


8578 


1293 


3079 


4865 


6651 


784CIP2B 983 


8584 


1294 


3080 


4866


6652 


784CIP2B 984 


SS9b 


1295 


3081 


4867 


6653 


784CIP2B_985 


8602 


1296 


3082 


4868 


6654 


784CIP2B 986 


8604 


1297 


3083 


4869 


6655 


784CIP2B 987 


8609


1298 


3084 


4870 


6656 


784CIP2B 988 


8612 


1299 


3085 


4871


6657 


"'784eiP2B 989 


8637 


1300 


3086 


4872 


6658 


784CIP2B 990


8640


1301


3087 


4873 


6659 


784CIP2B 991 


8643 


1302 


3088 


4874


6660 


784CIP2B 992 


8645


1303 


3089 


4875 


6661 


784CIP2B 993 


8650 


1304 


3090 


4876


6662 


784CIP2B 994 


8651 


1305


3091 


4877


6663


784CIP2B 995 


8654 


1306 


3092 


4878 


6664 


784CIP2B 996 


8655


1307


3093


4879 


6665 


784CIP2B 997 


8657 


1308 


3094 


4880 


6666 


784CIP2B 998


8665 


1309 


3095 


4881 


6667 


784CIP2B 999


QCCO 
OOOO 


1310 


3096 


4882 


6668 




OCT1 
ob / I 


1311


3097 


4883 


6669 


784CIPPR mm 


8672 


1312 


3098 


4884 


6670 


7fl4PTP?R 1 flO^ 


8692 


1313 


3099 


4885 


6671 




87C6 


1314


3100 


4886 


6672 


/□'Jv.iria 1UU4 


8716 


1315 


3101 


4887 


6673 


7fidrTD9Q innc 


8719


1316 


3102 


4888 


6674 


794C!TP5n mnfi 

iOvv.ir£o XUUo 


8743 


1317 


3103 


4889 


6675 




8764 


1318 


3104 


4890 


6676 


7R4PTP7R inno 


8764


1319 


3105 


4891 


6677


7fl^rTD7n irtno 


8764 


1320 


3106 


4892 


6678 




8774 


1321 


3107 


4893 


6679




8782 


1322 


3108 


4894 


6680 


'7R/ir"TT3"5t3 1 rn O 


8796 


1323 


3109 


4895 


6681 


7RArTD")ia 1 n"i *a 


8827 


1324 


3110 


4896


6682 


/ O^LlF^b JLUJL4 


8842


1325 


3111 


4897 


6683 


784CIP2B 1015


8842 


1326 


3112 


4898


6684 


/ 0«±V*.XirZ.fcJ XUJLo 


8858 


1327 


3113 


4899 


6685 


1 BAPTP7P imn 

/ -LUX / 


8971 


1328 


3114 


4900


6686


"JQAPTDIO l n 1 0 
/O4L1^0 XUXO 


8921 


1329 


3115 


4901


6687 


TflAPTDTtl 1 An a 


8927 


1330 


3116 


4902 


6688 


7fldrTP")n 1 nm 
'Oiv^xjr^o lUZU 


8942 


1331 


3117 


4903 


6689 


"7 fl A f"*T D 0 n t nrn 


8994 


1332 


3118 


4904 


6690 




9023 


1333 


3119 


4905 


6691 




9028 


1334 


3120 


4906 


6692 




9058 


1335


3121 


4907 


6693 




9058 


1336 


3122 


4908 


6694 




9079 


1337 


3123 


4909 


6695 


784CIP2B 1097 


9079 


1338


3124 


4910 


6696 


784CIP7B i n^fl 


9082 


1339 


3125 


4911 


6697 


784CIP2B 1029 


9084 


1340 


3126


4912 


6698 




9093


1341 


3127 


4913 


6699 


784CIP2B 1031


7XUX 


1342 


3128 


4914 


6700 


784CIP2B 1032 


9103 


1343 


3129 


4915 


6701 


784CIP2B 1033 


J X U D 


1344 


3130 


4916 


6702 


784CIP2B 1034 


9151 


1345 


3131 


4917


6703 


784CIP2B 1035 


9161 


1346 


3132 


4918 


6704 


784CIP2B 1036 


9172


1347 


3133 


4919 


6705 


784CIP2B 1037 


9174 


1348 


3134 


4920 


6706 


784CIP2B 1038 


9204


1349 


3135 


4921 


6707 


784CIP2B 1039 


9234


1350 


3136 


4922 


6708 


784CIP2B 1040 




1351 


3137 


4923 


6709 


784CIP2B 1041 


y z 0 y 


1352 


3138 


4924 


6710 


784CIP2B 1042


9256


1353 


3139 


4925 


6711


784CIP2B 1043


9276 


1354 


3140 


4926 


6712 


784CIP2B 1044


9345 


1355 


3141 


4927 


6713 


784CIP2B 1045 


9379


1356 


3142 


4928 


6714 


784CIP2B 1046 


9435


1357 


3143 


4929 


6715 


784CIP2B 1047


9437 


1358 


3144 


4930 


6716 


784CIP2B 1048 


9469 


1359 


3145 


4931 


6717 


784CIP2B 1049


9500 


1360 


3146


4932 


6718 


784CIP2B 1050 


9502 


1361 


3147 


4933 


6719 


784CIP2B 1051


9520


1362 


3148 


4934 


6720 


784CIP2B_1052 


9541 


1363 


3149 


4935 


6721 


784CIP2B 1053 


9541 


1364 


3150 


4936 


6722 


784CIP2B 1054 


9548 


1365 


3151 


4937 


6723 


784CIP2B 1055 


9556 


1366 


3152 


4938 


6724 


784CIP2B 1056 


9556 


1367 


3153 


4939 


6725 


784CIP2B 1057 


9575 


1368 


3154 


4940 


6726 


784CIP2B 1058 


9589 


1369 


3155 


4941 


6727 


784CIP2B 1059 


9599 


1370 


3156 


4942 


6728 


784CIP2B 1060 




1371


3157 


4943 


6729 


784CIP2B 1061 


9606 


1372 


3158


4944


6730 


784CIP2B 1062 




1373 


3159 


4945 


6731 


784CIP2B 1065 




1374 


3160 


4946 


6732


784CIP2B 1064


9646 


1375 


3161 


4947 


6733 


784CIP2B 1065 


79 f*k 1 


1376 


3162 


4948 


6734 


784CIP2B 1066


QT71 


1377 


3163 


4949 


6735 


784CIP2B 1067




1378 


3164


4950 


6736 


784CIP2B 1068


QDA 1 


1379 


3165 


4951 


6737 


784CTP9B T n^Q 


9811 


1380


3166 


4952 


6738 


7flAfTP9R 1 rtTA 


9843 


1381


3167 


4953


6739 


"7fliiPTD'5'n i mi 


9854 


1382 


3168 


4954 


6740 


/ D *« J. tr^o A. KJ i z. 


9854 


1383


3169 


4955


6741 


/ O H. V, 1 f U XV I 3 


9864 


1384


3170 


4956 


6742 


7flArTD*>n ~\ r\n a 


9864 


1385 


3171 


4957


6743 


"7fl4 f TD^B 10*7^ 


9871 


1386 


3172 


4958 


6744 


784CIP2B 1076


9879 


1387 


3173 


4959 


6745




9881


1388 


3174 


4960 


6746 


1 mo 


y oob 


1389 


3175 


4961 


6747 




9901 


1390 


3176


4962 


6748


(O^k.i.fAO 1UOU 


9912 


1391 


3177 


4963 


6749 


7fl4fTP5n i Aoi 


5bi £ 

y 9i 6 


1392 


3178 


4964 


6750 




9921 


1393 


3179 


4965 


6751 




9925 


1394 


3180 


4966 


6752 




9930


1395 


3181 


4967 


6753 




9949 


1396 


3182 


4968


6754 


7Q4r , TDon inae 


9951 


1397 


3183 


4969 


6755 




9959 


1398 


3184 


4970 


6756 




9973 


1399 


3185 


4971 


6757


784CIP2B 1089 




1400 


3186 


4972 


6758 


784CIP2B 1090 




1401 


3187 


4973 


6759


784CIP2B 1091 


10021 


1402 


3188 


4974 


6760 


784CIP2B 1092 


10041 


1403 


3189 


4975 


6761 


784CIP2B 1094 


10067 


1404 


3190 


4976 


6762 


784CIP2B 1095 


10073 


1405 


3191 


4977 


6763 


784CIP2B 1096


10112 


1406 


3192 


4978 


6764 


784CIP2B 1097 


10117 


1407 


3193 


4979 


6765 


784CIP2B 1098 


10132 


1408 


3194 


4980 


6766 


784CIP2B 1099 


10169 


1409 


3195 


4981 


6767 


784CIP2B 1100 


10217 


1410 


3196 


4982 


6768 


784CIP2B 1101 


10226 


1411 


3197


4983 


6769 


784CIP2B 1102 


10232 


1412 


3198 


4984 


6770 


784CIP2B 1103 


10237


1413 


3199 


4985 


6771 


784CIP2B 1104


10279 


1414 


3200 


4986 


6772 


784CIP2C_1 


33 


1415 


3201 


4987 


6773 


784CIP2C 2 


271 


1416 


3202 


4988 


6774 


784CIP2C 3 


848 


1417


" 3203 


4989 


6775 


784CIP2C 4 


849 


1418 


" 3204 


4990 


6776 


784CIP2C_5 


864 


1419 


3205 


4991 


6777 


784CIP2C_6 


953 


1420 


3206 


4992 


6778


784CIP2C 7 


980 


1421 


3207 


4993 


6779 


784CIP2C 8 


1595 


1422 


3208 


4994 


" 6780 


784CIP2C 9 


1697 


1423 


3209 


4995 


6781 


784CIP2C 10 


1744 


" 1424 


3210 


4996 


6782 


784CIP2C_11 


1937 


1425 


3211 


4997 


6783 


784CIP2C 12 


1955 


1426 


3212 


4998 


6784 


784CIP2C 13 


1955 


1427 


3213 


4999 


6785 


784CIP2C 14 


2185 


1428 


3214 


5000 


6786 


784CIP2C 15 


2889 


1429 


3215 


5001 


6787 


784CIP2C_16 


2901 


1430 


3216 


5002 


6788 


784CIP2C_17 


2902 


1431 


3217 


5003 


6789 


784CIP2C_18


2905 


1432 


3218 


5004


6790 


784CIP2C 19 


2948 


1433 


3219


5005 


6791 


784CIP2C 20 


2956 


1434 


3220 


5006 


6792 


784CIP2C 21 


2959 


1435 


3221 


5007 


6793 


784CIP2C 22 


2965 


1436 


3222 


5008 


6794 


784CIP2C 23 


2966 


1437 


3223 


5009 


6795 


784CIP2C 24 


2970 


1438


3224 


5010 


6796 


784CIP2C 25 


298S 


1439 


3225 


5011 


6797 


784CIP2C 26 


2987 


1440 


3226 


5012 


6798 


784CIP2C 27 


2993 


1441 


3227 


5013 


6799 


784CIP2C 28 


2993 


1442 


3228 


5014 


6800 


784CIP2C 29 


3017 


1443 


3229 


5015 


6801 


784CIP2C 30 


3046 


1444


3230 


5016 


6802 


784CIP2C 31 


3050 


1445


3231 


5017 


6803 


784CIP2P 35 


33 


1446 


3232 


5018 


6804 


784CIP2C 33 


3359 


1447 


3233 


5019 


6805 


784CIP2C 34 


3432


1448 


3234 


5020 


6806 


784CIP2C 35 


3438


1449 


3235 


5021 


6807 


784CIP2C 36


3439


1450 


3236 


5022 


6808 


784CIP2C 39




1451 


3237 


5023 


6809 




3466


1452 


3238 


5024 


6810


784CIP2C 41 


3466 


1453 


3239 


5025 


6811


784C3P2H 49 


3467 


1454 


3240 


5026 


6812


784CIP2C 43 


3468


1455 


3241 


5027 


6813 


784CIP2C 44 


3483 


1456 


3242 


5028


6814 


784CIP2C 45 


3484 


1457 


3243 


5029 


6815 


784CIP2C 46


3488 


1458 


3244 


5030


6816 


784CIP2C 47 


3491 


1459 


3245 


5031 


6817 


784CIP2C 48 


3493 


1460 


3246 


5032 


6818 


784CIP2C 49 


3494 


1461 


3247 


5033 


6819 


784CIP2C 50 


3495 


1462 


3248 


5034 


6820 


784CIP2C 51 


3496 


1463 


3249 


5035 


6821 


784CIP2C 52 


3503


1464 


3250 


5036 


6822 


784CIP2C_53


3503 


1465 


3251 


5037


6823 


784CIP2C 54 


3504 


1466 


3252


5038 


6824 


784CIP2C_55 


3511 


1467 


3253 


5039 


6825 


784CIP2C 56


3531


1468 


3254 


5040 


6826 


784CIP2C 57 


3536 


1469 


3255 


5041 


6827 


784CIP2C 58 


3546 


1470 


3256 


5042 


6828 


784CIP2C_59 


3548 


1471 


3257 


5043 


6829 


784CIP2C 60 


3551 


1472 


3258 


5044 


6830 


784CIP2C 61 


3553 


1473 


3259 


5045


6831 


784CIP2C 62


3564 


1474 


3260 


5046


6832 


784CIP2C 63 


3567 


1475 


3261 


5047 


6833 


784CIP2C 64 


3572 


1476 


3262 


5048 


6834 


784CIP2C 65 


3573 


1477 


3263 


5049 


6835 


784CIP2C 66


3574 


1478


3264 


5050 


6836 


784CIP2C 67 


3583 


1479 


3265 


5051 


6837 


784CIP2C 68 


3615 


1480 


3266 


5052 


6838


784CIP2C 69 


3623 


1481 


3267 


5053 


6839 


784CIP2C_70 


3629 


1482 


3268 


5054 


6840 


784CIP2C 71 


3666 


1483 


3269


5055 


6841 


784CIP2C 72 


3667 


1484 


3270 


5056 


6842 


784CIP2C 73 


3906 


1485 


3271 


5057 


6843 


784CIP2C 74 


3912 


1486


3272


5058 


6844 


784CIP2C 75 


3924 


1487




5059 


6845 


784CIP2C 76 


3928


1488


3274


5060 


6846 


784CIP2C 77 


3935 


1489


3275


5061 


6847 


784CIP2C 78 


3959 


1490 


3276


5062 


6848 


784CIP2C 79 


3981 


1491 


3277


5063 


6849


784CIP2C_80 


3989 


1492 


3278


5064 


6850 


784CIP2C 81 


4295 


1493 


3279


5065 


6851 


784CIP2C_82 


4300


1494


3280


5066 


6852 


784CIP2C_83 


4360 


1495


3281 


5067 


6853 


784CIP2C 84 


4362 


1496


3282 


5068 


6854 


784CIP2C 85 


4371 


1497


3283 


5069 


6855 


784CIP2C 86 


4373


1498


3284 


5070 


6856 


784CIP2C_87 


4376 


1499


3285 


5071 


6857 


784CIP2C 89 


4378 


1500 


3286 


5072 


6858 


784CIP2C_90 


4382


1501


3287


5073 


6859


784CIP2C 91 


4409 


1502 


3288 


5074 


6860


784CIP2C_92 


4421 


1503 


3289 


5075 


6861


784CIP2C 93 


4421 




3290 


5076 


6862 


784CIP2C 94 


4426 


1505 


3291 


5077 


6863 


784CIP2C 95 


4430 


1506


3292 


5078 


6864 


784CIP2C_96 


4435 


1507 


3293 


5079 


6865 


784CIP2C_97 


4436 


1508


3294 


5080


6866 


784CIP2C 98


4439 


1509 


3295 


5081 


6867 


784CIP2C_99 


4440 


1510 


3296 


5082 


6868 


784CIP2C_100 


4441


1511 


3297 


5083


6869 


784CIP2C 101 


4442 


1512 


3298 


5084 


6870 


784CIP2C 102 


4455 


1513 


3299 


5085 


6871


784CIP2C 103 


4462 


1514 


3300


5086 


6872 


784CIP2C_104 


4466 


1515 


3301 


5087 


6873 


784CIP2C 105 


4469 


1516


3302 


5088 


6874


784CIP2C_106 


4477 




3303


5089 


6875


784CIP2C_107 


4481 




3304


5090 


6876


784CIP2C_108 


4483 




3305 


5091 


6877 


784CIP2C_109 


4484 


1520


3306


5092 


6878 


784CIP2C_110 


4486 




3307


5093 


6879


784CIP2C_111 


4490 


1522


3308


5094 


6880 


784CIP2C 112 


4499 




3309 


5095 


6881 


784CIP2C 113 


4503 


1524


3310 


5096 


6882 


784CIP2C 114 


4506 


1525


3311 


5097 


6883 


784CIP2C_115 


4509 


1526


3312 


5098 


6884 


784CIP2C 116 


4514 


1527 


3313


5099 


6885 


784CIP2C 117 


4516 


1528 


3314


5100 


6886 


784CIP2C 118 


4522 


1529 




5101 


6887 


784CIP2C 119 


4525 


1530


3316


5102 


6888 


784CIP2C 120 


4527 


1531 




5103 


6889 


784CIP2C_121 


4528 


1532 


3318 




6890


784CIP2C_122 


4529 


1533 


3319 


5105


6891


784CIP2C_123


4532 


1534




5106 


6892


784CIP2C 124


4537 


1535 


3321 


5107


6893 


784CIP2C_125


4538 


1536 


3322


5108 


6894 


784CIP2C 126 


4551 


1537 




5109 


6895 


784CIP2C 127 


4552 


1538 


3324 


5110 


6896 


7H4CTP2H l^ft 

/ 01\»lfiiL. J. Z. O 


/ceo — 


1539 


3325 


5111 


6897 


784CIP2C_129


4567 


1540 


3326 


5112 


6898 


784CIP2C 130 


4568 


1541 


3327 


5113 


6899 


784CIP2C_132 


4585 


1542 


3328


5114 


6900 


784CIP2C 133 


4592 


1543 


3329 


5115 


6901 


784CIP2C_134


4609 


1544 


3330 


5116 


6902 


784CIP2C 135 


4616 


1545 


3331 


5117 


6903 


784CIP2C 136 


4617 


1546 


3332 


5118 


6904 


784CIP2C 137 


4618 


1547


3333


5119 


6905 


784CIP2C 138 


4620 


1548 


3334 


5120 


6906 




ACTA 


1549 


3335 


5121 


6907 




4632


1550 


3336 


5122


6908 




4634


1551 


3337 


5123 


6909 


TflAHTOr TAT 


4638 


1552 


3338 


5124 


6910 


7QirTD^r 1 A - * 


4639 


1553 


3339 


5125 


6911


•7 fl A r*T 15 0 r* "IAA 


4643 


1554 


3340 


5126


6912 


TQAHTniP 1 /l C 


4644 


1555 


3341 


5127 


6913


«o*iLXr^i, 14b 


4655 


1556 


3342 


5128 


6914 


"7 Q A hTBOr "1 A *7 


" /idea — 
4668 


1557 


3343 


5129 




784CIP2C 149


4677


1558 


3344


5130 


6916




4677 


1559 


3345 


5131


6917 


784CIP2C 150


4677 


1560 


3346 


5132 




784CIP2C_152


4682 


1561 


3347


5133


6919 


784CIP2C 153


4690 


1562 


3348


5134 




784CIP2C 154


4691 


1563 


3349


5135


6921 


784CIP2C 155


4727 


1564 


3350




6922 


784CIP2C 156 


4730


1565 


3351


5137 


6923 


784CIP2C 157 


4734 


1566 


3352 




6924 


784CIP2C 158 


4757 


1567


3353 


5139 


6925 


784CIP2C 159 


4764 


1568


3354


5140 


6926 


784CIP2C 160 


4786 


1569 


3355


5141 


6927 


784CIP2C 161 


4793 


1570 


3356 




6928 


784CIP2C_162


4825 


1571 


"3 s * 5 7 


5143 


6929 


784CIP2C 163 


4826 


1572 


3358 


5144 


6930


784CIP2C 164


4850


1573 


3359


5145 


6931 


784CIP2C 165 


4853 


1574


3360


5146 


6932 


784CIP2C 166 


4855 


1575


3361 


5147 


6933 


784CIP2C 167


4856 


1576 




5148 


6934 


784CIP2C 168


4867 


1577 




5149 


6935 


784CIP2C 169 


4869 


1578 


3364


5150 


6936 


784CIP2C_170 


4878 


1579 


3365 


5151 


6937 


784CIP2C 171 


4880 


1580 


3366 




6938 


784CIP2C 172 


4942 


1581


3367 


5153 


6939


784CIP2C_173 


4945 


1582 


3368


5154 


6940 


784CIP2C 174 


4950 


1583 


3369 




"694 1 


784CIP2C 175


4952 


1584 


3370 




6942 


784CIP2C 176 


4954 


1585 


3371 


5157 




784CIP2C 177


4956 


1586


3372 


5158 




784CIP2C 178


4961 


1587 


3373 


5159 


6945 


i04LirAL 1 / 17 


5590 


1588 


3374


5160 


6946 


/04LlrZU loll 


5599 


1589 


3375 


5161 


6947 




5692 


1590 


3376 


5162 


6948 


/CtLlrZL io« 


5732 


1591 


3377 


5163 


6949 




5765 


1592 


3378 


5164


6950 


/04^1F^V< 1B4 


5771 


1593 


3379


5165 


6951


/04^1f4t XO«> 


5774 


1594 


3380


5166 


6952 


TRAPTDOP 1 ft< 


5793 


1595 


3381 


5167 


6953 


*7ftdPT i an 


5806


1596 


3382 


5168


6954


IBAPTOOP 1 aa "~ 
/04Llr^L lQS 


5852 


1597 


3383 


5169 


6955 


/o4v,lr^L I07 


5892 


1598 


3384 


5170 


6956


784CIP2C 190


6057 


1599 


3385


5171 


6957


784CIP2C 191


6061 


1600 


3386 


5172 


6958 


784CIP2C 192 


6109 


1601 


3387 


5173 


6959 


784CIP2C_193 


6160 


1602 


3388 


5174 


6960 


784CIP2C 194 


6297 


1603 


3389 


5175 


6961 


784CIP2C_195 


6398 


1604 


"3390 " 


5176 


6962 


784CIP2C_196 


639S 


1605 


3391 


5177 


6963 


784CIP2C 197 


6415 


1606 


3392 


5178 


6964 


784CIP2C_198


6448 


1607 


3393 


5179 


6965 


784CIP2C_199 


6469 


1608


3394 


5180 


6966


784CIP2C 200 


6476


1609 


3395 


5181 


6967 


784CIP2C 201 


6561 


1610 


3396 




6968


784CIP2C 202


6574 


1611


3397 


5183 


6969 


784CIP2C 203 


6578 


1612


3398


5184 


6970 


784CIP2C 204 


6662 


1613 


3399 


5185


6971 


784CIP2C 205 


6672 


1614 




5186 


6972 


784CIP2C_206 


6691 






5187 


6973 


784CIP2C 207 


6695 


1616 


3402 


5188 


6974


784CIP2C 208 


6746 


1617




5189 


6975 


784CIP2C 209 


6898 


1618




5190 


6976 


784CIP2C_210


6938 


1619 


3405 


5191 


6977 


784CIP2C_211


6943 


1620


3406 


5192 


6978 


784CIP2C 212


7110 


1621 


3407 


5193 


6979 


784CIP2C 213 


7200 


1622 


3408 


5194 


6980 


784CIP2C_214


7212 


1623 


3409 


5195 


6981 


784CIP2C 215


7218 


1624 


3410 


5196 


6982 


784CIP2C 216 


7249 


1625 


3411 


5197


6983 


784CIP2C 217 


7500 


1626 


3412 


5198 


6984 


784CIP2C_218 


7509 


1627 


3413 


5199 


6985 


784CIP2C 219 


7523 


1628 


3414 


5200 


6986 


784CIP2C_220 


7544 


1629 


3415


5201 


6987 


784CIP2C_221 


7564 


1630 


3416 


5202 


6988 


784CIP2C 222 


7568 


1631 


3417 


5203 


6989 


784CIP2C 223 


7631 


1632


3418 


5204 


6990 


784CIP2C 224 


7813 


1633 


3419 


5205 


6991 


784CIP2C 225 


7831 


1634 


3420 


5206 


6992 


784CIP2C 226 


7843 


1635 


3421 


5207 


6993 


784CIP2C_227 


7907 


1636 


3422 


5208 


6994 


784CIP2C_228 


7943 


1637 


3423 


5209 


6995 


784CIP2C_229 


8175 


1638 


3424 


5210 


6996 


784CIP2C_230 


8216 


1639 


3425 


5211 


6997 


784CIP2C_231 


8225 


1640 


3426 


5212 


6998 


784CIP2C 232 


8271 


1641


3427 


5213 


6999 


784CIP2C_233 


8397 


1642 


3428 


5214 


7000 


784CIP2C 234 


8466 


1643 


3429 


5215 


7001 


784CIP2C_235


8503 


1644


3430 


5216


7002 


784CIP2C 236 


8953 


1645


3431


5217 


7003 


784CIP2C_237 


9106 


1646


3432


5218 


7004 


784CIP2C_238


9139 


1647 


3433 


5219 


7005 


784CIP2C_239 


955S 


1648 


3434


5220 


7006 


784CIP2C_240


9650 


1649 


3435


5221 


7007 


784CIP2C_241 


9889 


1650


3436


5222 


7008 


784CIP2C 242 


9933 


1651 


3437


5223 


7009 


784CIP2C 243


9953 


1652


3438


5224 


7010 


784CIP2C_244 


9981 


1653 




5225 


7011 


784CIP2D 1 


746 


1654


3440 




7012 


784CIP2D 2 


3558 


1655 


3441 


5227


7013 


784CIP2D 3 


3553 


1656 


3442 


5228


7014 


784CIP2D 4 


3633 


1657 


3443




7015 


784CIP2D_5 


3658 


1658 


3444


5230


7016 


784CIP2D 6 


3732 


1659


3445 


5231 


7017 


784CIP2D 7 


4004 


" 1660 


3446


5232 


7018 


784CIP2D_8


4700 


1661 




5233 


7019 


784CIP2D 9 


4703 


1662 


3448 


5234 


7020 






1663


3449 


5235 


7021


784CIP2D 11 


4894 


1664 


3450


5236


7022 


784CIP2D_12 


4918


1665 


3451 


5237 


7023 


784CIP2D 13 


5159 


1666 


3452 


5238 


7024 


784CIP2D_14 


7443 


1667


3453


5239 


7025 


784CIP2D_15 


8673 


1668 


3454 


5240 


7026 


784CIP2D 16 


8679


1669 


3455 


5241 


7027


784CIP2D_17 


8727 


1670 


3456 


5242 


7028


784CIP2D 18 


8734 


1671 


3457 


5243 


7029 


784CIP2D 19 


8756 


1672 


3458 


5244 


7030 


784CIP2D_20 


8018 


1673 


3459 


5245 


7031 


784CIP2D_21 


8844 


1674 


3460 


5246 


7032 


784CIP2D_22 


8846 


1675 


3461 


5247 


7033 


784CIP2D_23 


8912 


1676 


3462 


5248 


7034 


784CIP2D 24 


8918 


1677 


3463 


5249 


7035 


784CIP2D_25


8918 


1678 


3464 


5250 


7036 


784CIP2D 26 


8941 


1679 


3465 


5251 


7037 


784CIP2D 27 


8941 


1680 


3466 


5252 


7038 


784CIP2D 28 


8951 


1681 


3467 


5253 


7039 


784CIP2D 29 


8951 


1682


3468


5254 


7040 


784CIP2D 30 


9007 


1683 


3469


5255 


7041 


784CIP2D 31 


9012 


1684 


3470 


5256 


7042


784CIP2D 32 


9013 


1685 


3471 


5257 


7043 


784CIP2D 33 


9025 


1686 


3472 


5258 


7044 


784CIP2D 34 




1687 


3473 


5259 


7045 


784CIP2D 35 


9054 


1688 


3474 


5260 


7046 


784CIP2D 36




1689 


3475 


5261 


7047 






1690 


3476 


5262 


7048 


784CIP2D 38 




1691 


3477 


5263 


7049 


784CIP2D 39 




1692 


3478 


5264 


7050 


784CIP2D 40 


9152 


1693 


3479 


5265 


7051 


784CIP2D 41 




1694 


3480 


5266 


7052 


784CIP2D 42 


9223 


1695 


3481 


5267 


7053 


784CIP2D_43 


9223 


1696 


3482


5268 


7054 


784CIP2D 44 


9231 


1697 


3483 


5269 


7055 


784CIP2D_45 


9236 


1698 


3484 


5270 


7056 


784CIP2D 46 


9236 


1699 


3485 


5271 


7057


784CIP2D_47


9303 


1700 


3486 


5272 


7058 


784CIP2D 48 


9309 


1701 


3487 


5273 


7059 


784CIP2D_49


9314 


1702 


3488 


5274 


7060 


784CIP2D_50 


9326 


1703 


3489 


5275 


7061 


784CIP2D_51 


9339 


1704 


3490 


5276 


7062 


784CIP2D_52


9348


1705 


3491 


5277 


7063 


784CIP2D 53 937$ 


1706 


3492 


5278 


7064 


784CIP2D_54


9382 


1707 


3493 


5279 


7065 


784CIP2D 55 


9407 


1708 


3494 


5280 


7066 


784CIP2D_56 


9414 


1709 


3495


5281


7067 


784CIP2D 57 


9439 


1710 


3496


5282 


7068 


784CIP2D 58 


9485


1711 


3497 


5283 


7069 


784CIP2D 59 


9493


1712 


3498 


5284 


7070 


784CIP2D 60 


9501 


1713 


3499 


5285 


7071 


784CIP2D_61 


952* 


1714 


3500 


5286 


7072 


784CIP2D 62 


9526 


1715 


3501 


5287 


7073 


784CIP2D 63 


9551 


1716 


3502 


5288 


7074 


784CIP2D_64


9557 


1717


3503 


5289 


7075 


784CIP2D 65


9568 


1718 


3504 


5290 


7076 


784CIP2D 66 


9588 


1719 


3505


5291 


7077 


784CIP2D_67


9597 


1720 


3506 


5292 


7078 


784CIP2D 68 


9615 


1721 


3507 


5293 


7079 


784CIP2D 69


9628 


1722 


3508 


5294 


7080 


784CIP2D_70 


9649 


1723 


3509 


5295 


7081 


784CIP2D_71 


9652 


1724 


3510 


5296 


7082 


784CIP2D 72 


9660 


1725 


3511 


5297 


7083 


784CIP2D 73 


9662 


1726 


3512 


5298 


7084 


784CIP2D 74


9725


1727


3513


5299


7085


784CIP2D 75


9746


1728


3514 


5300 


7086 


784CIP2D 76 


9777 


1729 


3515 


5301 


7087 


784CIP2D_77 


9787


1730 


3516 


5302 


7088 


784CIP2D 78 


9790 


1731 


3517 


5303 


7089 


784CIP2D_79 


9842 


1732 


3518 


5304 


7090 


784CIP2D 80 


9842 


1733 


3519 


5305 


7091 


784CIP2D 81 


9848 


1734 


3520 


5306 


7092 


784CIP2D 82 


9867 


1735 


3521 


5307 


7093 


784CIP2D 83 


10010


1736 


3522


5308 


7094 


784CIP2D_84 


10011 


1737 


3523 


5309 


7095 


784CIP2D 85 


100^2 


1738 


3524 


5310 


7096


784CIP2D_86 


10057 


1739 


3525 


5311 


7097 


784CIP2D 87 


10085 


1740 


3526 


5312 


7098 


784CIP2D_89 


10139 


1741 


3527 


5313 


7099 


784CIP2D_90 


10142 


1742 


3528 


5314 


7100 


784CIP2D_92 


10165 


1743 


3529 


5315 


7101 


784CIP2D 93 


10173 


1744 


3530 


5316 


7102 


784CIP2D 94 


10173 


1745 


3531 


5317 


7103 


784CIP2D 95 


10273


1746 


3532 


5318 


7104 


784CIP2E 1 


3121


1747 


3533 


5319 


7105 


784CIP2E 2 


3628 


1748 


3534 


5320 


7106 


784CIP2E 4 


3673


1749 


3535 


5321 


7107 


784CIP2E 5 


4018


1750


3536 


5322 


7108


784CIP2E 6


4467 


1751 


3537


5323 


7109 




4865


1752 


3538 


5324 


7110 


1 TflAfTBOlT ft 


4916 


1753 


3539 


5325 


7111 




4923 


1754 


3540 


5326 


7112 


784CIP2E 10 


4926 


1755 


3541 


5327 


7113 


784CIP2E 11 


4962 


1756 


3542 


5328 


7114 


784CIP2E 12 


4963 


1757 


3543 


5329 


7115 


784CIP2E 13 


4964


1758 


3544 


5330 


7116 


784CIP2E_14 


d ODD 

** y o o 


1759 


3545 


5331 


7117 


784CIP2E 15 




1760


3546 


5332 


7118 


784CIP2E_16 


7682 


1761 


3547 


5333 


7119 


784CIP2E 17 


7682 


1762 


3548 


5334 


7120 


784CIP2E 18 


7ST9 


1763 


3549 


5335 


7121 


784CIP2E 19


7707 


1764 


3550 


5336


7122 


784CIP2E 20 


7707 


1765 


3551 


5337 


7123 


784CIP2E 21 


77 K 2 


1766 


3552 


5338


7124 


784CIP2E 22 


8357 


1767 


3553 


5339 


7125 


784CIP2E 23 


9065


1768 


3554 


5340 


7126 


784CIP2E 24 


9324 


1769 


3555 


5341 


7127 


784CIP2F 1 


2976 


1770 


3556 


5342 


7128 


784CIP2F_2 


3559


1771 


3557 


5343 


7129 


784CIP2F 3 


4021 


1772 


3558 


5344 


7130


784CIP2F 4 


4474 


1773 


3559 


5345 


7131 


784CIP2F 5 


4566 


1774 


3560 


5346 


7132 


784CIP2F 6 


4705 


1775 


3561


5347 


7133


784CIP2F_7 


4707 


1776 


3562 


5348 


7134 


784CIP2F 8 


4712 


1777 


3563 


5349 


7135 


784CIP2F 9 


5008 


1778 


3564 


5350 


7136 


784CIP2F_10


5009


1779 


3565 


5351 


7137 


784CIP2F_11


5015 


1780 


3566


5352


7138 


784CIP2F_12


5015 


1781 


3567


5353 


7139


784CIP2F_13


7724 


1782 


3568 


5354 


7140 


784CIP2F_14


7725 


1783 


3569 


5355 


7141 


784CIP2F_15


8828 


1784 


3570 


5356 


7142 


784CIP2F_16


8830 


1785 


3571 


5357 


7143 


784CIP2F_17


9739


1786


3572 


5358 


7144 


784CIP2F_18


9896 



TABLE 7



SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


5359 


337 


1131 


AHLSARLSALILDEVAILPAPQNLSVLSTNMKHLLMWSPVIAPG 
ETVYY SVEYQGE YBSLYTSHIW I PS SWCSLTEGP3CDVTDD ITA 
TVPYNLRVRATLGSQTS /CLEHP/VS I PLIETQPSLPDL/RMEI 
TKDGFHLVIEliEDLGPQFEFLVAYWRREPGAEEHVKMVRSGGIP 
VHLETMEPGAAY CVKAQTFVKA IGRYSAFS QTECVEVQGEAIPI* 
VLALFAFVGFMLILVWPLFVWKMGRLLQ/YLLLPRGGSSQTPW 
KITQF 


5360 


2 


1115 


PR VRSSGGQEDPASQQWARPRFTQP SKMRRRVI ARPVGS S VRLK 
CVAS GHPRPDI TWMKDDQALrRPEAAEPRKKKWTLSIiKNIiRPED 
SGKYTCRVSNRAGAINATYKVDVIQRTRSKPVLTGTHPVNTTVD 
FGGTTSFQCKVRSDVKPVIQWLKRVEYGAEGRHNSTIDVGGQKF 

FRS AFLTVLPDPKPPGPPVASSSSATS LPW PWI G 1 PAGAVFIL 
GTLLLWLCQAQKKPCTPAPAPPLPGHRPPGTARDRSGDKDLPSL 
AALSAGPGVGLCEEHGS PAAPQHLLGPGPVAGP KLYPKLYTGHS 
TPHTYTHPPPSCQliNSSHS 


5361 


3 


925 


HEGSISSANILLDDQFQPKLTDFAMAHFRSHLEHQSCTINMTSS 
SSKEI*W YMPEE YI RQGKLS IKTDVYS FGI V IMBVLTGCR WLDD 
P KH I QLR DLiLR E LME KRGLDS CLS FLDKKVP PCP RNFS AKL FCL 
AGRCAATRAKLR PSMDE VLNTLES TQAS L YFAEDPPTSLKS FRC 
PSPLFLENVPSI PVEDDBSQHNNLLPSDEGLRIDRMTQKTPFEC 
SQSEVMFLSLDKKPESKRNEEACNMPSSSCEESWFPKYIVPSQD 
LRP YKVN IDP S S E APGHSCRS R P VES S CSSKFS WDE YEQ YKKE 


5362


2 


4879 


S CQ VEGCTRT YNS S QS IG KHMKTAHPD Q YAA FKMQRKS KKGQXA ' 
NNLKTPNNGKFVYFLPSPVNSSNPFFTSQTKANGNPACSAQIiQH 
VS P P I FP AULAS VSTPLLS SM ES V I N PN I TSQDKN2QGGMLCS Q 
MENLPSTALPAQMEDLTKTVLPLNIDRGSDPFLSLPAESSSIDL 
FPSPADSGTNSVFSQLENNTNHYSSQIEGNTNSSFLKGGNGENA 
VFPSQVNVANKFSSTNAQQSAPEKVKKDRGRGQTGKERKPKHNTK 
RAKWPAIIRDGKFICSRCYRAFTNPRSI1GGHI/SKRSYCKPI1DGA 
EIAQELIiQSNGQPSLIASWlLSTNAVNIiQQPQQSTFNPEACFKD 
PS FLQLLAENRSPAFLPNTFPRSGVTNFNTS VSQEGSE I 1 1 QAL 
ETAG IPS T FEGAEMLS HVSTGCVSDAS QVNATVMPNPTVPP LLH 
TVCHPNTLLTNQNRTSNS KTSS IEECSSLPVFPTNDLLLKTVEN 
GhCSS SFPNSGGPSQNFTSN5SRVSVI SGPQNTRSSHLNKKGNS 
AS KRR KKVAPP L I APNASQNLVTSDLTTMGL IAKS VE I PTTNLH 
SNVIPTCEPQSLVENLTQKIiNNVNNQLFMTDVKENFKTSLESHT 
VLAPLTLKTENGDSQMMALNSCTTSVNSDLQISEDNVIQNFEKT 
LEI IKTAMNSQILEVKSGS QGAGE TSQNAQINYN I QL PS VNT VQ 
NNKLPDSSP\FSSFISVMPTESNIPQSE\VSHKEDQIQEILEGL 
QKLKLENDLSTPASQCVLINTS VTLTPTPVKSTADI TVI QPVSE 
MINIQFNDKVNKPFVCQNQGCNYSAMTKDALFKHYGKIHQYTPE 
M I L E I KKNQLKFAP F KC WPTCT KTFTRNSNLRAHCQL VHH FTT 
EEMVKLKIKRPYGRKSQSENVPASRSTQVKKQLAMTEENKKESQ 
PALEI.RAETQNTHSNVAVI PEKQLIEKKS PDKTESSLQ VITVTS 
EQCWTOALTNTQTKGRKIRRHKKEKEEKKRKKPVSQSLEFPTRY 
SPYRPYRCVHQGCFAAFTIQQNLILHYQAVHKSDLPAFSAEVEE 
ESEAGKESEETETKQTLKEFRCQVSDCSR I FQAI TGLIQHYMKL 
HEMTPEE IES MTAS VDVGKFPCDQLECKSS FTTYLNYWHLEAD 
HGIGLRASKTEEDGVYKCDCEGCDRIYATRSNLLRHIFWKHNDK 
HKAHL I R PRRLT PGQENMSS KANQ E KS KS KHRGT KHS RCGKEG I 
KMPKTKRKKKNNLENKNAKIVQIEENKPYSLKRGKHVYSIKARN 
DALS ECTS RFVTQ Y P CM I KGCTS WTS E SNI I RH Y KCHKLS KAF 
TSQHRNLLIVFKRCCNSQVKETSEQEGAKNDVKDSDTCVSESND 
NSRTTAT VSQKE VEKNE* DEMDELTELFITKL INEDSTS VETQA 
NTSSNV SNDFQEDNLCQS ERQKASNLKRVNKEKNVS QNKKRKVE 
KAEPASAAELSSVRKEEETAVAIQTIEBHPASFDWSSFKPMGFE 
VSFLKFLEESAVKQKKNTDKDHPNTGNKKGSHSNSRKN1DKTAV 
TSGNHVCPCKESETFVQFANPSQLQCSDNVKIVLDKNLKDCTEL 
VliKQLQEMKPTVSLKKLBVHSNDPDMSVMKDISIGKATGRGQY 


5363 


8066 


703 


RLCCTGGGEGTPGASGKRGPAATTSLVLCIPSVPPPVPFPTLWP 
PPSWRRQP PGGIRRDFSRRLRREANLVATCLPVRAS LPHRLNML 
RGPG PGL L L LAVLC LGT A VPS TG AS KS KRQAQQM VQ PQS PVAVS 
QS KPGCYDNGKHYQ INQQWERT YLGNALVCTC YGGS RGFNCES K 
PE AE ETCFD K YTGNT YR VGDT YERP KDS MI WDCTC I GAGRGR I S 
CTlANRCHEGGQSYKIGDTWRRPHErGGYMLECVCIiGNGKGElfT 
CKPIAEKCFDHAAGTSYWGETWBKPYQGWMMVDCTCLGEGSGR 
ITCTSRNRCNDQDTRTSYRIGDTWSKKDNRGNLLQCICTGNGRG 
EWKCERHTSVQTTSSGSGPFTDVRAAVYQPQPHPQPPPYGHCVT 
DS G WYS VGMQ lA * KTQGNKQML \CTCLGNGVS CQETAVTQTYG 
GNSNGEPCVLPFTYNGRTFYSCTTEGRQDGHLWCSTTSNYEQDQ 
KYS FCTDHTVL VQTRGGNS NGALCHFPFL YNNHNYTDCTS EGRR 
DNMKWCX3TTQNYDADQKFGFCPMAAHEE I CTTNEGVMYR IGDQW 
DKQHDMGHMMRCTC VGNGRGEWTCIAYSQLRDQCI VDD I TYNVN 
DT FHKRHEEGHMLNCTC FGQGRG RWKCD P VDQCQDS ETGT F YQI 
GDSWEKYVHGVRYQCYCYGRG IGEWHCXJPI.QTY PSSSG P VEVFI 
TETPSQPNSHPIQWNAPQPSHISKYILRWRPKNSVGRWKEATIP 

ghlns yti kgl kpg wyegcl i s i qo yghqe vtrfdftttstst 
pvtsnt\ vtgettp fs plvatsesvte1 tas s fws wvsasdtv 
sgfrveyelseegdepqylvlpstatsv\nip\dllpgrkyivn 
vyqi s edgeqs l ilstsqttapdappdptvdqvddts i wrwsr 
pqapitgyrivyspsvegsstelnlpetansvtiisdlqpgvqyn 
iti yaveenqe stpwiqqettgtprsdtvps prdlqfvevtdv 
kvtimwtppesavtgyrvdvipvnlpgehgqrlplsrntf\aen 
tglspgvtyyfkvfavshgreskpltaqqttklNdaptnlqfvn 

ETDS T VL VRWT P PRAQ I TG YRLT VGLTRRGQ PRQYNVG PS VS KY 
PLRNLQPAS E YT VS LVA I KGNQE S PKATGVFTTLQ PGS S I PP YN 
TE VTETT I V I T WTPAPRI GFKLG VRPSQGG E AP R EVTS DS G S I V 
VSGLTPGVEYVYTIQVIiRiX3QERDAP\IVNK\WTPIiSPPTNLH 
LEAN PDTGVL TVS W E R S TTPD I TG YR ITTT PTNGQQGNS LE E W 
HADQSSCTF\DNIjEVPGLEYNVSVYTVKDDKESVPI SDTI IPAV 
P PPTDLRFTN/ I IX3PDTMRVTW \AP P PS I DLTNFLVRYS PVKNE 
GRMLQSLS 1 FFLSDN\ AWLTNLLPGTEYWS VSSVYEQHESTP 
\ LRGRQKTGLD S P \ TG I D FS \D I TA\NS FT \ VHW \ I APRA/ TP I 
TGYRXR\HHPEHF\SGRPREDR\VPHSRNSITLTNLTPGTEYW 
S I VALNGREES PLIilGQQSTVSDVPRDLE WAATPTSLLI \SWD 
APAVTVRY YRITYGETGGNSPVQE FTVPGSKSTATI SGLKPGVD 
YTI TVYAVTGRGDS PAS SKPISI NYRTE I DKPSQMQVTDVQDNS 
I S VKWLPS S S P VTG YRVTTT \ PKNGPG \ P TKTKTAG PDQTE MT I 
E^LQPTVEYWSVYAQNPSGESQPLVQTAVTNIDRPKGLAFTDV 
DVDS I KIAWESPQGQVSRYRVTYSSPEDGIHELFPAPDGEEDTA 
ELQXSLRPGSEYWSWAIiHDDMESQPLIGTQSTAIPAPTDLKFT 
Q VTP TSLS AQWTPPNVQLTGYRVRVTPKBKTG PM KE INLAPDS S 

svwsglmvatkyevsvyalkdtltsrpaqgwttlenvspprr 
arvtdatettitiswrtktetitgfqvdavpangcyrpiqrrikp 
dvrsytitglqpgtdykiylytiindnarsspwidastaidaps 

NLR FLATT PNS LL VS WQ P ? RAR I TG YI I KYEKPGS P P RE WPRP 
RPGVTEATITGLEPGTEYTIYVIALKNNQKSEPLIGRKKTDELP 
QLVTLPHPNLHGPEILDVPSTVQKTPFVTHPGYDTGNGIQLPGT 
SGQ Q P S VGQQM I FE EHGFRRTTP PTTATP IRHR PRP YP PNVGQ E 
ALSQTT I S WAP FQDTS E Y T ISCHP VGTDE EPLQ FRVPGTS TS AT 
LTGLTRG^TYNIIVEALKDGXJRHKVREEVVTVGNSVNEGLNQPT 
DDSCFDPYTVSHYAVGDEWERMSESGFKLLCQCLGFGSGHFRCD 
S S RWCHDNGVN YKIGE KWDRQGENGQMMS CTCLGNGKGE FKCDP 
HBATCYDDGKTYHVGEQWQKEYLGAICSCTCFGGQRGWRCDNCR 
RPGGE PS PEGTTGQS YNQYSQRYHQRTNTNVNCP I ECFMPLDVQ 
ADREDSRE 


5364 


8066 


703 


RLCCTGGGEGTPGASGKRGPAATTSLVLCIPSVPPPVPFPTLWP 
PPSWRRQPPGGIRRDFSRRLRREANIiVATCLPVRASLPHRLNML 
RGPGPGLLiLLAVLCIjGTAVPSTGASKS KRQAQQM VQ PQS PVAVS 
QS KPGCYDNGKH YQ INQQWERTYUSNALVCTCYGGSRGFNCESK 
PEAEETCFDKYTGNTYRVGDTYERPKDSMIWDCTCIGAGRGRIS 
CT I ANRCHEGGQS YK I GDTWRR PHETGG YMLECVCLGNG KGEWT 
CKPIAEKCFDHAAGTSYWGETWEKPYQGWMMVDCTCLGEGSGR 
ITCTSRNRCNDQDTRTSYRIGDTWSKKDNRGNLLQC I CTGNGRG 
EWKCERHTS VQTTS SGSGP FTDVRAAVYQPQPHPQPPP YGHCVT 
DSGWYSVGMQLA*KTQGNKQML\CTCLGNGVSCQETAVTQTYG 
GNS WGE P CVL P FTYNGRT F YS CTTEG RQDGHLWCSTTSNYEQDQ 
KYSFCTDHTVLVQTRGGNSNGALCHFPFLYNNHNYTDCTSEGRR 
DNMKWOGTTQN YDADQKFGFCPMAAH3E I CTTNEGVM YRIGDQW 
DKQHDMGHMMRCTCVGNGRGEWTCI AYSQLRDQCI VDD I TYNVN 
DTFHKRHEE GHMLNCT C FGQGRGRWKCDP VDQCQDS ETGTFYQ I 
GD S WE KYVHG VR YQC Y CYGRG I GEWHCQPLQT Y PS SSGP VE VF I 
TETPSQPNSHPIQWNAPQPSH1SKYILRWRPKNSVGRWKEATIP- 
GHLNS YTI KGLKPG WYEGQLI S I QQ YGHQEVTRFDFTTTSTST 
PVTSNT\VTGETTPFS PLVATSES VTEITASS FWSWVSAS DTV 
SGFR VE YELS EEGDEPQ YLVLPSTATS V\NI P \ DiLPGRKY I VN 
VYQI S EDGEQSLILSTSQTTAPDAP PDPTVDQVDDTS I WRWSR 
PQAPITGYRIVYSPSVEGSSTELNLPETANSVTIiSDLQPGVQYN 
ITIYAVEENQESTPWIQQETTGTPRSDTVPSPRDLQFVEVTDV 
KVTr MWTPPESAVTGYRVDVI PVNI» PGEHGQRLPLS RNT F \ AEN 
TGLSPGVTYYFKVFAVSHGRESKPLTAQQTTKL\DAPTNLQFVN 
ETDSTVLVRWTPPRAQITGYRLTVGLTRRGQPRQYNVGPSVSKY 
PLRNLQPASE YTVSLVAI KGNQE S P KATGVFTTLQ PGS S I P P YN 
TEVTETTIVITWTPAPRIGFKLGVRPSQGGEAPREVTSDSGSIV 
VSGLTPGVEYVYTIQVLRDGQERDAP\lVNK\VVTPLSPPTNIiH 
LEANPDTGVLTVSWERSTTP Dl TGYRITTTPTNGQQGNSLEE VV 
HADQS S CTF\ DNIiEVPGLE YNVS VYTVKDDKE S VP I SDT 1 1 PAV 
PPPTDLRFTN /I LGPDTMRVTW\APPP S IDLTNFLVRYSP VKNE 
GRMLQSLS ifflsdn\a wltnllpgts YWS VSS VYEQHES TP 
\LRGRQKTGUDSP\TGIDFS\DITA\WSFT\VHW\IAPRA/TPI 
TGYRIR\HHPEHF\SGRPREDR\VPHSRNSITLT»LTPGTEYVV 
SlVALNGREESPLLIGQQSTVSDVPRDLEWAATPTSLLlXSWD 
APAVTVR Y YR I T YGETGGNS PVQEFT VPGS KS TATI S GLKPG VD 
YTITVYAVTGRGDSPASSKPIS INYRTE XDKPSQMQVTDVQDNS 
I SVKW LPS SS P VTGYRVTTT\ PKNGPG\ PTKTKTAGPDQTEMTI 
EGLQP TVE YWS VYAQNP SGESQPLVQTAVTNIDR P KGUAFTD V 
DVDS I KI AWES PQGQVSRYRVTYSSPEDGIHELFPAPDGEEDTA 
ELQGLRPGSEYTVSWALHDDMESQPLIGTQSTAIPAPTDLKFT 
QVTPTSLSAQWTPPNVQLTGYRVRVTPKEKTGPMKEINLAPDSS 
S VWSGLMVATKYEVSVYALKDTLTSRPAQGWTTLENVS PPRR 
ARVTDATETT I T I S WRTKTET I TGFQVD AVPANGQT P I QRT I KP 
DVRSYTITGLQPGTDYKIYIiYTLNDNARSSPWI DAS TAI DAPS 
NLRFLATTPNSLLVSWQPPRARITGYIIKYEKPGSPPREWPRP 
RPGVTEATITGLE PGTEYTI YVIALKNNQKSEPHGRKKTDELP 
QLVTLPHPNLHGPEILDVPSTVQKTPFVTHPGYDTGNGIQLPGT 
SGQQPSVGQQMIFEEHGFRRTTPPTTATPIRHRPRPYPPNVGQE 
ALSQTT I SWAP FQDTSE Y 1 1 S CHPVGTDE EPLQFRVPGTSTSAT 
LTGLTRGATYNI IVEALKDQQRHKVREEWTVGNSVNEGLNQPT 
DDSCFDPYTVSHYAVGDEWERMSESGFKLLCQCLGFGSGHFRCD 
SSRWCHDNGVNYKIGEKWDRQGENGQMMSCTCLGNGKGEFKCDP 
HEATC YDDGKT Y H VGEOWO KE YLG AT C S CTC FGGORGWP CT\nr*V> 
RPGGEPS PEGTTGQS YNQYSQRYHQRTNTKVNCP I ECFMPLDVQ 
ADREDSRE


5365


8066


703 


RLCCTGGGEGTPGASGKRGPAATTSLVLC I PS VPPP VP FPTLW P 
PPSWRRQPPGGiRRDFSRRLRREANLVATCLPVRASLPHRLNML 
RGPGP GLLLLAVLCLGTAVPS TGASKS KRQAQQMVQPQS pvavs 
QS KPGCYDNGKH YQ INQ QW B RT YLGNALVCTC YGGS RGFNCES K 
PEAEETCFDKYTGNTYRVGDT YERPKDSMI WDCTC IGAGRGRI S 
CTI ANRCHEGGQS YKIGDTWRRPHETGG YMLECVCLGNGKGEWT 
CKPIAEKCFDHAAGTSYWGETWEKPYQGWMMVDCTCLGEGSGR 
ITCTSRNRCNDQDTRTS YRIGDTWS KKDNRGNLbQC t d*GNGRG 
EWKCERHTSVQTTSSGSGPFTDVRAAVYQPQPHPQPPPYGHCVT 
DSGWYS VGMQLA* KTQGNKQMI>\ CTCLGNGVS CQETAVTQT YG 
GMSNGKPCVLPPTYNGRTFYSCTTEGRQDGHLWCSTTSNYEQDQ 
KYSFCTDHTVLVQTRGGNSNGAIiCHFPFLYNNHNYTDCTSEGRR 
DNM KWCGTTQN YDADQKFG FCPMAAHEE I CTTNBGVM YR I GDQW 
DKQHDMGHMMRCTCVGNGRGEWTCIAYSQLRDQCIVDDITYNVN 
DTFHKRHEEGHMLNCTCFGQGRGRWKCDPVDQCQDSETGTFYQI 
GDSWEKYVHGVRYQCYCYGRGIGEWHCX3PLQTYPSSSGPVEVFI 
TETPSQPNSHPIQWNAPQPSHISKYILRWRPKNSVGRWKEATIP 
GHLNS YT I KGL KPGWYEG QLI S I QQ YGHQE VTR FDFTTTSTS T 
PVTSNT\VTGETTPFSPLVATSESVTEITASSFWSWVSASDTV 
SGFR VE YELSEEGDEPQ YliVLPS TATS V\NI P \DLI*PGRKYI VN 
VYQ I S EDGEQSLI LSTS QTTAPDAPPDPTVDQ VDDTS I WRWSR 
PQAP I TGYRIVYS PS VEGS S TE LNLPETANS VTL3 DLQPGVQ YN 
I TI YA VEENQES TP WI QQETTGTPR S DT VPS PRDLQ FVE VTD V 
KVTI MWTPPESAVTG YRVDVI PVNLPGEHGQRLPLSRNTF\AEN 
TGLS PGVTY YFKVFAVSHGRESKPLTAQQTTKL\ DAPTNLQFVN 
ETDSTVLVRWTPPRAQITGYRLTVGLTRRGQPRQYNVGPSVSKY 
PLRNLQPAS E YTVSLVAI KGXQES PKATGVFTTLQPGS S I PP YN 
TEVTETTIVITWTPAPRIGFKLGVRPSQGGBAPREVTSDSGSIV 
VSGLTPGVEYVYTIQVLRDGQERDAP\IVNK\WTPLSPPTNLH 
LEANP DTG VLTVS WERSTT PD I TG YRI TTT PTNGQQGNSLEEW 
HADQSSCTF\DNLRVPGLEY>JVSVYTVKDDKESVPISDTIIPAV 
PPPTDLRFTN/ ILGPDTMRVTW\APPPSIDLTNFI*VRYS PVJQJE 
GRMLQS LS I FFLSDN\AWIiTNLLPGT3 YWS VS S VYEQHESTP 
\LRGRQFCTGLDSP\rGIDFS\DITA\NSFT\VHW\lAPRA/TPI 
TG YRIR\ HHPEnF \ SGR PREDR\ VPHSRNSITLTNIiTPGTE YW 
SIVALNGREES PLLIGQQSTVSDVPRDIiEWAATPTSLL I \SWD 
APAVTVRYYRI T YGETGGNS P VQEFTVPGSKSTATI SGLKPGVD 
YTI TVYA VTGRGDS PAS S KP IS IN YRTE I DK PS QMQVTD VQDNS 
ISVKWLPSSS PVTGYRVTTT\ PKNGPG \ PTKTKTAGPDQTEMT I 
EGLQ PTVE YWS VYAQNP S GE SQPLVQTAVTNI DRP KGLAFTD V 
DVDS1KIAWESPQGQVSRYRVTYSSPEDGIHELFPAPDGEEDTA 
ELQGLRPGSE YT VS WALHDDMEBQPL I GTQSTAI PAPTDLKFT 
QVTPTSLSAQKTPPNVQLTGYRVRVTPKEKTGPMKEINIiAPDSS 
S VWS G LMVAT K YE VS VYALKDT LTS R P AQG WTTL ENVS P PRR 
ARVTDATETTITI S WRTKTET I TGFQVDAVP ANGQTP IQRTIKP 
DVRS YT ITGLQPGTDYKI YLYTLNDNARSS P WI DASTAIDAPS 
NLR F LATTPNS LLVS WQ P PRAR I TGY I I KYE KPGS P PRE WPRP 
RPGVTEATITGLEPGTEYTIYVrALKNNQKSEPLIGRKKTDELP 
QLVTLPHPNLHGPEILDVPSTVQKTPFVTHPGYDTGNGIQLPGT 
SGQQPSVGQQMIFEEHGFRRTTPPTTATPIRHRPRPYPPNVGQE 
ALSQTTISWAPFQDTSEYIISCHPVGTDEEPLQFRVPGTSTSAT 
LTGLTRGATYNI I VEALKDQQRHKVREEWTVGNSVNEGLNQPT 
JDDSCFDPYTVSKYAVGDEWERMSESGFKLIjCQCLGFGSGHFRCD 
SSRWCHDNGVNYKIGEKWDRQGENGQMMSCTCLGNGKGEFKCDP 
HEATCYDDGKTYHVGEQWQKEYLGAICSCTCFGGQRGWRCDNCR 
RPGGEPSPEGTTGQSYNQYSQRYHQRTNTNVNCPIECFMPLDVQ 
ADREDSRE


5366


8066


703 



RIiCCTGGGEGTPGASGKRGPAATTSLVLCIPSVPPPVPFPTLWP " 
PPSWRRQPPGGIRRDFSRRl.RREANr,\/aTrT,PVPnQT DUDrnur 

RG PGPGLLLLA VLCLGTAVPSTGAS KS KRQAQQMVQ PQS PVAVS 
QSKPGCYDNGKHYQINQQWERTYLGNALVCTCYGGSRGFNCESK 
PEAEETCFDKYTGNTYRVGDTYERPKDSM1WDCTCIGAGRGRIS 
CTIANRCHEGGQSYKIGDTMRRPHETGGYMLECVCLGNGKGEWT 
CKP I AEKCFDHAAGTS YWGETWEKP YQG WMMVDCTCLG EGSGR 
ITCTSRNRCNDQDTRTSYRIGDTWSKW>NRGJTX>IjQCICTGNGRG 
E WKCERHTS VQTTS SGSG P FTDVRAAVYQPQ PH PQPP P YGHCVT 
DSGWYSVGMQLA*KTQGNKQML\CTCLGNGVSCQETAVTQTYG 
3NSNGEPCVLPFTYNGRTFYSCTTEGRQDGHI*WCSTTSNYEQDQ 
KySFCTDHTVLVQTRGGNSNGALCHFPJFLYNNH^YTDCTSEGRR 
DNMKWCGTTQNYDADQK17GFCPMAAHEE I CTTNEGVM YR IGDQW 
DKQHDMGHMMRCTCVGNGRGEWTCIAYSQLRDQCIVDDITYNVN 
DTFHKRHEEGHMLNCTCFGQGRGRWKCDPVDQCQDSETGTFYQI 
GDSWEKYVHGVR YQCY C YGRG I GE WHCQPLQT YPS S SG P VEVF I 
TE TPSQPNS H P I Q WNAPQ PSH I S KY I LR WR P KNS VGRW KE AT I P 
GHLNSYTIKGLKPGWYEGQLISIQQYGHQEVTRFDFTTTSTST 

pvtsntWtgettpfsplvatsesvteitassfwswvsasdtv 

SGFRVEYEI^EEGDEPQYLVLPSTATSV\NIP\DLLPGRKYIVN 

vyqisedgeqslilstsqttapdappdptvdqvddtsiwrwsr 
pqapitg yr 1 vys ps vegs stelnlp3tans vtlsdlq pgvqyn 
itiyaveenqestpwiqqettgtprsdtvpsprdlqfvevtdv 
kvtimwtppesavtgyrvbvipvnlpgehgqrlplsrntfXaen 

TG LS PG VTYY FKVFAVSHGRE S K PLTAQQTTKJj\ DAP TNLQ FVN 

etdstvlvrwtppraqitgyrltvgltrrgqprqynvgpsvsky 
plrnlq pas b ytvslvai kgnqes pkatgvfttlqpgss i p pyn 

TEVTETTIVITWTPAPRIGFKLGVRPSQGGEAPREVTSDSGSIV 
VSGLT?GVEYVYTIQVLRDGQERDAP\IVNK\WTPLSPPTNLH 
LEAN P DTG VLTVS WERS TTPD ITG YR I TTT PTNGQQGNSLEEW 
I IADQS S GTF\ DNLEVPGLE YNVS VYTVKDDKESVPlSDTI I PAV 
PP P TDLRFTN / 1 LGPDTMR VTW \ AP PP S I DLTNFL VR YS P VKNE 
GRMLQ5LS I FFI>SDN\ AWLTNLLPGTE YWSVSS VYEQHE STP 
\ LRGRQKTGLDS P \TG I DFS \DITA\NSFT\ VHW \ IAPRA/TPI 
TGYRI R\HHPEHF\SGRPREDR\VPHSRNS ITLTNLTPGTEYW 
SIVALKGREESPLLIGQQSTVSDVPRDLEWAATPTSLLI\SWD 
APAVTVR YYR X T YGETGGNS P VQEFTVPGS KS TATISGLKPG VD 
YTI TVYAVTGRGDS PAS S KP I S IN YRTE I DKPS QMQVTDVQDNS 
ISVKWLPSSSPVTGYRVTTT\PKNGPG\PTKTKTAGPDOTEMTI 
EGLQPT VE YWS VYAQNPSGESQ PLVQ TAVTN I DRPKGLAFTD V 
DVDSIKIAWESPQGQVSRYRVTYSSPEDGIHBLFPAPDGEEDTA 
ELQGLRPGSEYTVSWALHDDMESQPLIGTQSTAIPAPTDLKFT 
QVTPTSLSAQWTPPNVQLTGYRVRVTPKEKTGPMKEINLAPDSS 
S WVSGLMVATKYEVS VYALKDTLTS RPAQG WTTLENVS PPRR 
ARVTDATETT I T I S WRTKTET ITGFQ VDAVPANGQTPI QRT I KP 
DVRSYTITGLQPGTDYKIYLYTLNDNARSSPVVIDASTAIDAPS 
NLRFLATTPNSIiLVSWQPPRARITGYIIKYEKPGSPPREVVPRP 
RPGVTEATITGIiEPGTEYTI YVIALKNNQKSEPIjIGRKKTDEIj P 

qlvtlphpniihgpe ildvpstvqktp fvthpgydtgngiqlpgt 
sgqqpsvgqqmifeehgfrrttppttatpirhrprpyppnvgqe 
alsqttiswapfqdtseyiischpvgtdeeplqfrvpgtstsat 

LTGLTRGATYNI I VEALKDQQRHKVRBEWTVGNS VNEGLNQPT 

ddscfdpytvskyavgdewermsesgfkllcqclgfgsghfrcd 
ss rwchdng vny ki gekwd rqgengqmms ctclgngkge fkcdp 

HEATCYDDGKTYHVGEQWQKEYliGAICSCTCFGGQRGWRCDNCR 
RPGGEPSPEGTTGQSYNQYSQRYHQRTNTNVNCPIECFMPLDVQ 
AD REDS RE 


5367


235 


3591 



KKILNMLCKKNIVIEYLAblLYEYLYGFCFSGIKKYLIIHVLRL 
I LELWMTRLLLE KS VS LQTQ YLLL I VKI LS WFPGKEMRHHLQ IM 
EVMMRKQDS / RIVGNGS EQQLQKELADVLMDPPMDDQPGEKELV 
KRSQLDGEGDGPLSNQLSASSTINPVPLVGLQKPEMSLPVKPGQ 
GDSEASSPFTPVADEDSWFSKIiTYIiGCASVNAPRSEVEALRMM 
SILRSQCQISLDVTLSVPNVSEGIVRLLDPQTNTEIANYPIYKI 
L FCVRGHDGT PES DCFAFTES H YNAEL FR I HVFRCE I Q EAVS R I 
LYS FATAFRRS AKQTP LSATAAPQTPDSD IFTFS VSLE I KEDDG 
KG YFSAVPKDKDRQCFKXiRQG IDKKI VI YVQQTTNKELAI ERCF 
GLLLSPGKDVRNSDMHI.LDLESMGKSSDGKSYVITGSWNPKSPH 
FQWNEET PKDKVL FMTTAVDLVI TE VQEP VRFLLET KVRVCS P 
NERLFWPF S KRSTTENFFDKLKQI KQRER KNNTDTI* YE WCLBS 
ESERERRKTTASPSVRLPQSGSQSSVIPSPPEDDEEEDNDEPLL 
SGSGDVSKECAEKlLETWGELl>SKWHLNLNVRPKQLSSL\mNGV 
PEALRGEVWQIiLAGCHNNDHLVEKYRIIilTKESPQDSAITRDIN 
RTF PAHD Y F KDTGGDGQD S LY KI C KAYS VYDEE IGYCQGQS FLA 
AVLLLHMPEEQAFSVLVKIMFDYGLRELFKQNFEDLHCKFYQLE 
RLMQEYI PDLYNHFIiD IS LEAHM YASQWFLTLFTAKFPL YMVFH 
IIDLIiLCEGISVIFNVALGLLKTSKDDLLJ.TDFEGALKFFRVQL 
PKRYRSEENAKKLMELACNMKISQKKLKKYEKEYHTMREQQAQQ 
EDP I ERFERENRRLQEANMRLEQENDDLAHELVTS KIALRKDLD 
NAEEKADALNK3LLMTKQKLIDAEEEKRRLEEESAHLKKMCRRE 
LDKAESEIKKNSSIIGDYKQICSQLSERLEKQQTANKVEIEKIR 
QKVDDCERCREFFNKEGRVKG I S STKEVLDEDTDEEKETLKNQL 
REMELELAQTKL\Q L VE A SCKI QD \ LEHP F * GLPFNE \ VQAA\ K 
KTWFNRTLS S I XTATG VQGKETC 


5368 


573 


2014 


GAAAG AAD P RRGS LGGRTMLD F A I FAVTFL LAL VGAVL YL YPAS 
RQAAGI PGITP TEEKDGNLPD I WSGSLHE FLVNLHER YG P WS 
FWFGRRLWSLGTVDVLKQHINPNKTLD/LF*NHAEVI IKVSIW 
WWQCE*KP\QRKKLYENGVTDSLKSNFALIiLKLPEELLDKWIiSY 
PETQH\VPI,SQHMLGFAMKSVTQMVMGSTFEDDQEVIRFQKNHG 
TVWSEIGKGFLDGSLDKKMTRKKQYEDALMQLES VLRNI I KERK 
GRNFSQHI F IDS L VQGNLNDQQ ILED3M 1 FSLASCI ITAKLCTW 
AI WFLTTSEEVQKKL YEE I NQ VFX3NGPVTPEKIEQLR YCQH VLC 
ETVRTAKLTPVS AQLQDI EGKIDRF 1 1 PRETLVLYALG WLQDP 
NTWPSPHKFDPDRFDDELVMKTFSSLGFSGTQECPBLRFAYMVT 
TVLLS VLVKRUHLLSVEGQVIETKYELVTS SREEAWITVS KRY 


5369 


1 


6622 


PRSLCFSLWAEAAVLADGGLRRRRRLl^GTMSASFVPNGASLED 
CHCNL FCLADLTG I KWKKYVWQG PTS AP I LF PVTEEDP I LS S FS 
RCLKADVLG/VWRRDQRPERRE\L* IFWGGEDP\ VLLTLFTMTY 
QKKKMECXSRMDFPMNAVIjCFSKAVHNIjLERCLMNRNFVRIGKWF 
VKPYEKDEKPINKSEHLSCSFTFFLHGDSNVCTSVEINQHQPVY 
LLSEEHITLAQQSNS PFQVI LCPFGLNGTLTGQAFKMSDSATKK 
LIGEWKQFYPISCCLKEMSEEKQEDMDWEDDSLAAVEVLVAGVR 
MI Y PACFVLVPQSDI PTPSPVGSTHCSSSCLGVHQVPASTRDPA 

MS S VTLTPPTS p e evqtvd pqsvq kwvkfs s vsdgfnsds ts hh 

GGKI PRKIiANHWDRVWQECNMNRAQNECRKYSASSGGLCEEATA 

AKVASWDFVEATQRTNCSCLRHKNLKSRNAGQQGQAPSLGQQQQ 

ILPKHKTNEKQEKSEKPQKRPLTPFHHRVSVSDDVGMD\ADS\A 

SQRLV\ISAP\DSQ\VRFSNIR\TNDVAK\TPQMHGTEMANSPQ 

PPPLSP\HPCDWDEGVTKTPSTPQSQHFYQMPTPDPLVPSKPM 

EDRIDSLSQSFPPQYQEAVEPTVYVGTAVNLEEDEANIAWKYYK 

FPKKKDVEFLPPQLPSDKFKDDPVGPFGQESVTSVTELMVQCKK 

PLKV S DELVQO YQ I KNQCLSAIASDAEQEPKIDP YAFVEGDEEF 

LFPDKKDRQNS EREAG KKHKVEDGTS S VTVLSHEEDAMSLFS PS 

IKQDAPRPTSKARPPSTSLIYDSDLAVSYTDLDNLFNSDEDELT 

PGSKRSANGSDDKASCKESKTGNUDPLSCISTADLHKMYPTPPS 

LEQKIMGFSPMNMNNKEYGSMDTTPGGTVLEGNSSSIGAQFKIE 

VDEGFCSPKPSEIKDFSYVYKPENCQILVGCSMFAPLKTLPSQY 

LPLIKLPEECIYRQSWTVGKLELLSSGPSMPPIKEGDGSNMDQ.E 

YGTAYTPQTHTSCGMPP5SAPPSNSGAGILPSPSTPRFPTPRTP 

RTPRTPRGAGGPASAQGSVKYENSDLYSPASTPSTCRPLNSVEP 

ATVPSIPEAHSLYVNLILSESVMNLFKDCNSDSCCICVCKMNIK 

GADVGVYIPDPTQEAQYRCTCGFSAVMNRKFGNNSGLFFEDELD 

I IGRNTDCGKEAEKRFEALRATSAEHVNGGLKESEKLSDDL ILL 

LQDQCTNI,FSrFGAADQDPFPKSGVISNWVRVEERDCCNDCYLA 

LE HGRQ FMDNMS GGKVDEALVKS SCLH P WS KRND VSMQCSQD I L 

RMLLS LQP V LQDAI QKKRT VR P WGVQG PLTWQQ FH KMAGRGS YG 

TDES PEPLP I PTFLLG YDYD YL VLS P F ALP YWERLMLE P YGS QR 

D IAYWLCPBNEALLNGAKS FFRDLTAI YESCRLGQHRPVSRLL 

TDCSIMRVGSTASKKLSEKLVAEWFSQAADGNNEAFSiCLKLYAQV 

CRYDLGPYLASLPLDSSLLSQPNLVAPTSQSLITPPQMTNTGNA 

NT PSATLAS AAS STMTVTSGVAI S TS VATANSTLTTASTSS S S S 

SNLNSGVSSNKLPSFPPFGSMNSNAAGSMSTQANTVQSGQLGGQ 

QTSALQTAGISGBSSS LPTQPHPDVSESTMDRDKVGI PTDGDSH 

AVTYPPAIWYIIDPFTYENTDESTNSSSVWTLGLLRCFLEMVQ 
TLPPHIKSTVSVQIIPCQYLLQPVKHEDREIYPQHLKSLAFSAF 
TQCRRPLPTSTNVKTLTGFGPGLAMETALRSPDRPECIRLYAPP 
Fl LAP VKDKQTELGETFGKAGQKYNVLFVG YCLSHDQRWI LAS C 
TDLYG&LLETCIINIDVPNRARRKKSSARKFGLQKLWEWCLGLV 
QMSSLPWRWIGRLGRIGHGELKDWSCLLSRRNLQSLSKRLKDM 
CRMCGI SAADS PS ILSACLVAME PQGSFVIM PDS VSTGS VFGRS 
TTLNMQTSQLNTPQDTSCTHILVFPTSASVQVASATYTTENLDL 
AFNPNNDGADGMGIFDLLDTGDDLDPD1 IN r LPAS PTGSPVHS P 
GS HY PHGGDAG KGQS TDRLLS TE PHEEV PN I LQQPLALGYFVST 
AKAGPLPDWFWSACPQAQYQCPLFLKASLHLHVPSVQSDELLHS 
KHSHPLDSNQTSDVLRFVLEQYNALSWLTCDPATQDRRSCLPIH 
FWLNQTjYNFIMNMTj 


5370


1226 


716 


RWSRKLELRRAAQATB S RP PQSQEMHPPTGKEVHALKRLRDSAN 
AND VET VQQLLEDGADPCAADDKGRTALHFASCNGNDQ I VQLLL 
DHGADPNQRDG^GNTPLHIiAACTNHVPVITTIjLRGGARVDALDR 
AGRTPLHLAKSKLNILQEGHAQCLKAVR/HGGEADHPYAEGVSG 
APRAT*AARCSGVFPSPSRWLGSAPWSRSSCTIWSLPLHEAKCR 
AVRPLSSAAQGSAPSSSSCCTVSTSLALAESLSLFRACTSLPVG 
GCISWL 


5371 


1331 


1*7 


1 AAMLWKLLLRSQS CRLCS FRKMRS P PKYRPFIACFTYTTDkQS~ 
S KENTRTVEKLYKCS VD IRKIRR \ * KDGYF* RMKPMLKKLRI / P 
LQELG ADETAVAS I L ERCP E A I VCS P TAVNTQRKLWQLVCKNEE 
EL I KL I EQFPESFFT I KDQSNQKLNVQFFQE LGLKNWI SRLLT 
AAPNVFHNPVEKNKQMVRILQESYLDVGGSEANMKVWLLKLLSQ 
NPFI LLNSPTAI KETLEFLQEQGFTS FE I LQLLSKLKGFLFQLC 
PRS IQNS IS FS KNAFKCTDHDLKQLVLKCPALLY YSVPVLEERM 
QGLLREGIS I AQI RETPMVLELTPQ I VQ YRI RKLNSSGYR I KDG 
HLANLNGSKKEFEANFGKIQAKKVRPLFNPVAPLNVEE 


5372 


51 


857 


SPGAQFLWAAPDMPDPLFSAVQGKDEILHKALCFCPWLGKGGME 
PLRLLILLFVTELSGAHNTTVFQGVAGQSLQVSCPYDSMKHWGR 
RKAWCRQLGEKG P CQR WS THNLWLLS FLRRWNGSTAITDDTLG 
GTLT I TLRNLQP HDAGLYQ CQS LHGS E ADTLRKVLVE VLAD PLD 
HRDAGDLWFPG\DLRASRM?MWSTASPGASWKEKSPSHPLPSFS 
S W PAS FSSRF * Q PAPSGLQPGMDRS QG H I HPVNWTVAMTQG I SS 
KLCQG 


5373 


2814 


346 


VKKTKSIFNSAMQEMEVYVENIRRKFGVFNYSPFRTPYTPNSQY^ 
QMLLDPTNPSAGTAKIDKQEKVKLNFDMTASPKILMSKPVLSGG 
TGRR I S LSDM PR S PMS TNS S VHTGS DVEQDAE KKATS SHFS ASE 
ESMDFLDKSTAS PASTKTGQAGSLSGSPKPFS PQLSAPITTKTD 
KTSTTGS ILNLNLDRSKAEMDLKELSES VQQQSTP VPLIS PKRQ 
IRSRFQLNLDKT IESCKAQLGINE I SEDVYTAVEHSDSEDSEKS 
DSSDSEYISDDEQKS*GTSQEDTEDKEGCQMDKEPSAVKKKPKP 
TNPVEIKEELKSTSPASEKADPGAVKDKASPEPEKDFSGKAKPS 
PHPIKDKLKGKDETDSPTVHLGLDSDSE\NELVIDLGEDHSGRE 
GRKNKKEPKE PS PKQDWGKTPPSTTVGSHSPPETPVLTRSSAQ 
TSAAGATATTS TS ST VTVTAPA PAATGS PVKKQR PLLPKE \ TAP 
AVQRSCGTSSTVQQKEITQSPSTSTITLVTSTQSSPLVTSSGSM 
STLVSSVNGDLPIGTASADVAADIAKYTSKL\MDAIKGTM\TEI 
YNDLS KN\TTW KAQLAEDSQGLR I E I E KLQ W LHQQ EL \ S ENKHN 
LELTi^^RQSWEQBRDRLIAEVKKQLELEKQQAVDETKKKQWC 
ANFKKSAI F YCCWNTS YCD YPCQ\ QAHWPEH\ MKS CTQSATAPQ 
\QEADAE\VNTETLNKSSQGSSSSTQSAPSBTASA\SKEKETSA 
EKSKESGSTLDLSGSRETPSSILLGSNQGSDHSR\SNKSSWSSS 
DEKRGS \TRSDHN/TPSTQHGRSLL PGKESRAGTP FLGTS K


5374


2814


346 


VKKTKS t FNS AMQEME VYVEN I RR KFG VFN YS P FRTP YTPN& Q Y 
QMLLDPTNPSAGTAK IDKQEKVKLNFDMTASPK I LMSKP VLS GG 
TGRRI SLSDMPRS PMSTNSSVHTGSDVEQDAEKKATSSHFSASE 
ESMDFLDKSTAS PASTKTGQAGSLSGSPKP PS PQLSAPITTKTD 
KTSTTGSILNLNLDRSKAEMDLKELSESVQQQSTPVPLISPKRQ 
I RS RFQLNLDKTI ESCKAQLGINE IS ED VYTA VEHSDS EDS EKS 
DSSDSE YI SDDEQKS *GTSQEDTEDKEGGQMDKE PSAVKKKPKP 
TNP VEI KEELKSTS PAS EKADPGAVKDKASPEPEKDFSGKAKPS 
PHPIKDKLKGKDETDSPTVHLGLDSDSE\NELVIDLGEDHSGRE 
GRKNKKEPKE PS PKQDWGKTPPSTTVGSHS P PETPVLTRSSAQ 
TSAAGATATTSTSSTVTVTAPAPAATGSFVKKQRPLLPKE\TAP 
AVQRSCGTSSTVQQKEITQSPSTSTITIjVTSTQSSPLVTSSGSM 
STLVS S VNGDL P IGTAS ADVAADI AKYTS Kl» \MDAI KGTM \ TE I 
YNDL S KN\ TT W KAQLAEDS QGLRI E I EKLQWIiHQQE L\S EMKHN 
LELTt^AEMRQSWEQERDRLlAEVKKQLELEKQQAVDETKKKQWC 
ANFKKEAIPycCWNTSYCDYPCQ\QAHWPEH\MKSCTQSATAPQ 
\QEADAE\VNTETLNKSSQGSSSSTQSAPSETASA\SKEKETSA 
EKSKESGSTLDLSGSRETPSSILI,GSNQGSDHSR\SNKSSWSSS 
DE KRG S \TRSDHN/TPSTQHGRSLLPGKESRAGT PFLGTS K 


5375 


2907 


1116 


HIFIiAEEEPMLERRCRGPLAMGPAQPRLLSGPSQESP^lbGKES 
RGriRQQGTSVA\QSGAQAPGRAHRCAHCRRHFPGWVA\LWLHTR 
RCQA/RGLPL PCPECGRRFRHAP FLAI*HRQVKAAATPDWGFACH 
LCGQSFT?GWVAXjVLHLRAHSAAKAGP?ACPKMARDAFWRRKAAS 
SSILRRCHPSRPRGPRPFICGNCGRSILPTWDQ/IiKVAHKRVHV 
SRRP*ERGPPAKVFWGPRPRGPPTGDTPPGPGGDAVDRPF\QCA 
CCGKRFRHK\PNLrRSHAACrSGERPHQ/CSRECG\KRFTNKPY 
LT S \HRRI THTARQP Y P CKE CGRR FRHKPNLLS H S K I H KRS EGS 
AQAAPGPGSPQLPAGPQESAAEPTPAVPLKPAQEPPPGAPPEHP 
QDP IEAP PSL YS CDDCGRS FRIiERFLRAHQRQHTGERP FTCAEC 
GKNFGKKTHLVAHSRVHSGERPFRLARKCGRRFLPRASQSGGRN 
SAEPNAPRFG PFVCPDCGKAFRHKP YLAAHRP I ATPAEKP YVCP 
DCRKAFSQKSNL\VSHRRIHTGERPYACPDCDRSFSQKSNLITH 
RKSHIRDGAFCCAICGQTFDDBERLIiAHQKKHDV 


5376 


4504 


591 


VS T FS LCLWPAGGGGRGR VSNMAQS KRHVYSRTP SGSRMSAEAS " 
AR P JjR VGSR V E V 1 GKGHRGT VAY VG ATLFATGKW VG V I LD E AKG 
KNDGTVQGRKYFTCDEGHGIFVRQSQIQVFEDGADTTSPETPDS 
S AS KVLKREGTDTTAKTS KLRGLKP KKAPTAR KTTTRRP KPTRP 
ASTGVAGASS SLGPSGSASAGELSS SE PSTPAQTPLAAP 1 1 PTP 
VLTS PGAVPPIiPS PSKEEEGLRAQVRDLEE KLETLRLKRAEDKA 
KLKELEKHKIQLEQVQEWKS KMQBQQADLQRRLKEARKEAKEAL 
EAKERYMEEMADTADAIEMATLDKEMAEERAESIiQQEVEALKER 
VDELTTDLEI LKAEIEEKGSDGAAS S YQLKQLEEQN AR LKDALV 
RMRDL S S S EKQEHVK\LQKLMEKKNQEIjEWRQQRE RLQEELS Q 
AESTIDELKEQVDAAXGAEEMVEMLTDRNLNLEEKVRELRETVG 
DLEAMNEMNDBLQENARETEIiEIiREQLDMAGARVREAQKRVEAA 
QETVADYQQTIKKYRQLTAHLQDVNRELTNQQEASVERQQQPPP 
ETFD FKI KFAETKAHAKAI EMELRQME VAQANRHMS LLTAFMPD 
SFLRPGGDHDCVLVLIiLMPRLICKAEL I RKQAQEKFELSENCSE 
RPGLRGAAGEQLS FAAIGLVY\SLMPAAGHRYHRY* CHALSQCR 
LD\ VYKKVGSLYPEMSAH ERSLDFLI EIiLHKDQLDETVNVEPIjT 
KAIKYYQHLYS IHLAEQPEDCTMQLADHI KFTQSALDCMS VE VG 
RLRAFLQGGQEATDIALLLRDLETSCS \ DI RQFCKKIRRRMPGT 
DAPGI PAALAFGPQVSDTLIjDCRKHLTWWAVLOE VAAAAAQLI 
APLAENEG L L VAAL E ELAFKAS EQ I YGT PSSS P YECLRQS CNI L 
ISTMNK\LVTAMQEGEYnAERPPSKPPP\VELRAAALRAEITDA 
EGLGLKIiE DRETV I KELKKSLKIKGEELSEANVRliTLLEKKLDS 
AAKDADERIEKVQTRUEETQALLRKKEKEFEETMDAIjQADIDQL 
EAEKAELKQRLNSQSKRTIEGLRGPPPSGIATLVSGIAGEEQQR 
GAIPGQAPGS VPGPGLVKDSPLLIiQQI S AMRIiH I SQLQHENS I L 
KGAQMKASLASLPPLHVAKIiSHEGPGSELPAGALYRKTSQLljET 
LNQLSTHTHWDITRTS PAAKS PSAQLMEQVAQLKS LSDTVEKL 
KDEVLKETVSQR PGAT VPTDFATFPS S AFbRAKE EQQDDTVYMG 
KVTFSCAAGFGQRHRLVLTQEQLHQLHSRtilS 


5377 




1106 


dvpckrvlpaeaqekgqltls cgesgeeg \f* yhevrqaeges * 
/wfgpnvrlvhtqi,ktkkpsgtlkakfylhtgstkfaarisct:< 
ss * wpgydgwwggqyi fifrgmrweeqp 


5378


2009 


664 


Q ASGTTIiRPIiPDL P QLKRREATS RNRAL KPRGRIiVLMTSCL PAX* 
RFIATPRDSAMPHIDNDVKLDFKDVLDRPKRSTLKSRSEVDLTR 
sfsfrnskqtysgvpiiaanmdtvgtfemakvlcks*vpgsfwd 
vpqmgcwliyklftlkwkmlllsvllpasilvaekfslptavh 
khyslvqwqefagqnpdcliehlaassgtgssdfeqleqilealp 
qvkyicldvangysbhfvefvkdvrkrfpchrimagnvvtgemv 
eelilsgadi i kvg i g pgs vcttrkktgvg ypqlsavme cadaa 
hglkgh 1 1 s dggcs c pgd vakafg agadf vmlggmlaghs esgg 
elierdgkkyklfygmss*i\amVkkyaggvaeyrasegktvev 
pfkgd vehti rdi lgg i rstct yvgaaicl kelsrrttfi r vtqq 
vnpifseac 


5379 


2009 


664 


QASGTTLR PLPDLPQLKRREATS RNRALKPRGRLVLMTS CLPAL 
RF I ATPRLSAMPHI DNDVKLDFKD VLLRPKRSTLKSRS E VDLTR 
SFSFRNSKQTYSGVPIIAANMDTVGTFEMAKVLCKS*VPGSFWD 
VPQMGCVFL I YKLFTLKWKMLLLS VLLPAS I LVAEKFS LFTAVH 
KHYS LVQWQE FAGQN PDC LE HLAAS SGTGS SDFEQLEQ I LEA I P 
QVK YI CLDVANGYS EHFVEF VKD VRKRFPQHTIMAGNWTGEMV 
EELILSGADI I ICVG IG P GS VCTTR KKTG VG Y PQLSAVMECADAA 
HGLKGHI I SDGGCS CPGDVAKAFGAGADFVMLGGMLAGHSESGG 
ELIERDGKKYKLFYGMSS * I\AM\KKYAGGVAEYRASEGKTVEV 
PFKGDVEHTIRDILGGIRSTCTYVGAAKLKELSRRTTFIRVTQQ 
VNPIFSEAC 


5380 


2 


2050 


PSRAGGAERGRAAAARS pggsaagwecpsvldeagactmsscvs 

SQPSSNRAAPQDELGGRGSSSSESQKPCEALRGLSSLSIHLGME 
S F 1 WTECE PGCAVDLGLARDRPLEADGQEVPLDTSGSQARPHL 
SGRKLSLQERSQGGLAAGGSLDMNGRCI CPSLP YSPVS S PQSS P 
RLPRRPTVESHHVS ITGMQDCVQLNQYTLKDE IGKGSYGWKLA 
YNENDNTYYAMKVLSKKKLIRQAAFPRRPPPRGTRPAPGGCIQP 
RGPI\EQVYQEIA\ILKKLDHPNW\KLVEVL\DDPNEDHLYMV 
F \ ELVNQGP VME VPTLKP LS EDQAR F YFQDL I KGIE YLH YQK 1 1 
H \RD I KPSNLLVGEDGHI KIADFG VSNEFKGSDALLSNTVGTPA 
FMAPBSLSE TRKIFSGKALDVWAMG VTLYCFVFG* C P FMDERIM 
CLHSKIKSQALEFPDQPDIAEDLKDLITRMLDKNPESRIWPEI 
KLH P WVTRHGA2 PL PS EDENCTLVE VTEE EVENS VKH I P S LATV 
ILVKTMIRKRSFGNPFEGSRRBERSL5APGNLLTKKPTRECESL 
SELKT+KISPLPACCKVT*EFPHPSGCRPSCWQPPFLHTHSQPR 
♦PEPPRTDEALCPYETGRTCWAPLLQVLWWVGTPLPFPLSTSWL 
PDLVGAPGSHFCFLNIALLRYNSHTM 


5381 


2 


2050 


PSRAGGAE RGRAAAAR S PGGSAAG WE CPS VLDEAGACTMS S CVS 
S Q P S SNRAAPQDELGGRG S S S S E S QK PCEALRGLS SLS I HLGME 
SFIVVTECEPGCAVDIiGlJu^RPLEADGQEVPLDTSGSQARPHL 
SGRKLSLQBRSQGGLAAGGSLDMNGRCICPSLPYSPVSSPQSSP 
RLPRRPTVESHHVS ITGMQDCVQLNQYTLKDE IGKGSYGWKLA 
YNENDNTYYAMKVLSKKKLIRQAAFPRRPPPRGTRPAPGGCIQP 
RGP I \ EQVYQE I A\ ILKKLDHPNW\KLVEVL\DDPNEDHLYMV 
F\ ELVNQGPVME VPTLKPLSEDQAR F YFQDLIKGIE YLH YQKI I 
H \ RD I KPSNLL VGEDGH I KIADFGVSNE FKGSDALLSNTVGTPA 
FMAPESLSETRKIFSGKALDVWAMGVTLYCFVFG*CPFMDERIM 
CLHSKIKSQALEFPDQPD IAEDLKDLITRMLDKNPESRI WPEI 
KLHPW VTRHGAE PLPSEDENCTLVEVTEEE VENSVKH I PSLATV 
ILVKTMIRKRSFGNPFEGSRREERSLSAPGNLLTKKPTRECESL 
S EL KT * KI S P LPACCKVT * EFPHPSGCRPSCWQPPFLHTHSQPR 
*PEPPRTDEALCPYETGRTCWAPLLQVLWWVGTPLPFPLSTSWL 
PDLVGAPGSHFCFLNIALLRYNSHTM • 


5382 


1536


203


GARGS QQDA PALQEAB VRGPERAQ PARGRMTKARL FRLWLVLGS 
VFMILLIIVYWDSAGAAHFYLHTSFSRPHTGPPLPTPGPDRDRE 
LTADS DVDE FLDKFLSAGVKQSDLPRKETEQPPAPGSMEES VRG 
YDWS PRDARRS PDQGRQQAERRS VLRGFCANSSLAFPTKERP FD 
DIPNSELSHLIVDDRHGAIYCYVPKVACTNWKRVMIVLSGSLLH 
RGAPYRDPLRI PREHVHNAS AHLTFNKFWRR YG KLSRHLM KVKL 
KKYTKFLFVRDPFVRLISAFRSKFELENEEF/ * PQVRRAHAAAV 
RQ PHQ PAR LGARGL PR W PQ \ VS FAN F I Q YLLDPHT2KLAP FNEH 
WRQ VYRLCHPCQ I D YDFVGKLETLDEDAAQLLQLLQVDLAAPLP 
PELPGTGPPSSWEEDWFAKIPLAWRMI,yia,YEADFVLFfeypKpH 
ENLLRD 1 


5383 


45 


5250 


VERLLGC RNS KRTW RM LIS KNMP WRRkQG I S FGMYSAE ELKKLS " 

VKSITNPRYLDSLGNPSANGLYDLALGPADSKEVCSTCVQDFSN 

CSGHLGHI EL P I/TVYNPLL FD KL YIiLLRG SCLNCHMLT CPRAVI 

HLLliCQLRVLEVGALQAVYELERILSRFLEENADPSASEIREEL 

EQYTTEIVQNNLLGSOX5AHVKNVCESKSKLIALFWKAHMNAJCRC 

PHCKTGRSWRKEHNSKLTITFPAMVHRTAGQKDSEPLGIEEAQ 

IGKRGYLTPTSAREHLSALWKNEGFFLNYLFSGMDDDGMESRFN 

PSVFFLDFLWPPSRSRPVSRLGDQMFTNGQTVNLQAVMKDWL 

IRKLLALMAQEQKLPEEVATPTTDEEKDSLIAIDRSPLSTLPGQ 

SLIDKLYNIWIRLQSHVNlVFDSEMDKLMMDKYPGIRQIIiEKKE 

GLFRKHMMGKRVDYAARSVICPDMYINTNEIGIPMVFATKLTYP 

QPVTPWNVQELRQAVINGPNVHPGASMVINEDGSRTALSAVDMT 

QREAVAKQLI*TPATGAPKPQGTK I VCRHVKNGD I LLLNRQPTLH 

RP S I Q AHRAR I L» PE EKVLR LH YANC KAYNAD FDGDEMNAHF PQS 

ELGRAEAYVLACTDQQ YLVPKDGQ PLAGLIQDHM VSGAS MTTRG 

CFFTREHYME LVYRGLTD KVGRVXLL S PS I LKPFPLWTG KQWS 

TLLINIIPEDHIPLNLSGKAKITGKAWVKBTPRSVPGFNPDSMC 

ESQVI IREGELLCGVLDKAHYGSSAYGLVHCCYE I YGGETSGKV 

LTCLARIjFTAYLQLYRGFTLGVEDILVKPKADVKRQRIIEESTH 

CGPQAVRAALNLP EAAS YD3 VRGKWQDAHLGKDQRDFNMI DLKF 

KEEVNHYSNE INKACMPFGLHRQFPENTLQLMVQSGAKGSTVNT 

MQ ISC JtljGQI ELEGRSTPLMASGKSJLPCFEPYEFTPRAGGFVTG 

RFLTG I KPPEFFFHCMAGREGLVDTAVKTSRSG YLQRCI I KHLE 

GLWQYDLTVRDSDGSWQFLYGEDGLDIPKTQFLQPKQFPFLA 

SN YE V I MKSQHLHE VL S RADPKKALHH FRA I KKWQS KH PNTLIiR 

RGAFLS YSQKIQEAVKALKLESENRNGR/RPWDS /G/RMLRMWY 

ELDEESRRKYQKKAAACPDPSLSVWRPDIYFASVSETFETKVDD 

ysqewaaqteksyekselsldrlrti»lql\kwqrslcepgeavg 

LLAAQS IGEPSTQMTLNTFHFAGRGEMNVTLGIPRLREILMVAS 
ANIKTPMMSVPVLNTKKALKRVKSLKKQLTRVCLGEVLQKIDVQ 
ESFCMEEKQNKFQVYQLRFOFLPHAYYQQEKCI*RPEDILRFMET 
RFFKLLMES IKKKNNKASAFRNVNTRRATQRDLDNAGELGRSRG 
EQEGDEEEBGH I VDAEAEEGDADASDAKRKBKQE EE VDYESEEE 
EEREGEENDDEDMQEERNPHREGARKTQEQDEEVGL/GH*GGPV 
PSRP PDAAPETHPQ PGAPGA\ EAMERRVQAVRE I HP FI DDYQ YD 
TEESLWCQVTVKL PLMK I NFDMSSIi WS LAHGAVI YATKG I TRC 
LLNETTNNKNE KELVLNTEG INLPEL FKYAE VLDLRRLYSND IH I 
Al ANT YG I SAALRVI EKEI KDVFAVYG IAVDPRHLSLVAD YMCF 
EGVYKPLNRFGIRSNSSPLQQMTFETSFQFLKQATMLGSHDELR 
S PSACLWGKWRGGTGLFELKQPLR 


5384 


196 


886 


wovAJUKxitr j. vxj"Li»vije , i'ust:ii'c:xi*sijF\PGRPHAIjPEIRPYINt 1 
TILKGDKGDPGPMGLPGYMGREGPQGBPGPQGSKGDKGEMGSPG 
APCQKRFFAFSVGRKTALESGEDFQTLLFERVFVNLDGCFDMAT 
GQFAAPI/RG I YFFSLNVHS WN YKETY VH IMHNQKEAVI L YAQ PS 

ERSIMOSOSVMIiTYTiZlYrtn'BVtjnrDT T?VDADt<Mn Yvrkmnrvmyvnui 1 

SGHLIKAEDD


5386


326 


799 


i,MVPRTKKEAPAPPKABAKAKAl,\KAKKAVLKDVKSHKKNKIHrH 
S PTFRR PKTD* LRRQPKYP WKSTPRRNKLDHHVI I KFPI/TTE * A 

VKKIENNSliIjVFTVDVKANKHQIKQAVKK/LCDIDVAKVNrLIQ 
SDGERKAYVRLiAPDYDALWATKIGIT J 


5386 


326 


799


LMVPRTKXEAPAPPKAEATOKAIiXKAKKAVLJCDVHSHKraaKlHM '"[ 
S PTFRR P KTI* * LRRQ p K YP WKSTPRRNKLDHHVI I KFPLTTE * A 
VKKIE^SLLVFTVDVTCANKHQIKQAVKK/LCDIDVAKVNTLIQ 
SDGERKAYVRLAPDYDALWATKIGIT


5387


2 


2117 



FWAASGGCWFVLGERRAGSLLSAS YGTFAMPGMVLFGRRWAIA " j 
SDDLVFPGFFELVVRVl>WWIGILTLYLMHRGKItDCAGGALLSSY 
LIVLMILLAWICTVSAIMCVSMRGTICNPGPRKSMSKLLYIRL 
^FFPEWVWASI^AAWADGVQCDRTVVNGriArVVVSWiriAA 
r WS 1 1 1 VFDPLGGKMAP YSaAGPSHLDSHDSSOLLNGLKTAAT 






WO 01/53312 



PCT/US00/34263 



SVWETRIKLLCCCIGKDDHTRVAFSSTAELFSTYFSDTDLVPSD 
1AAGIALLHQQQDNIRNNQ3PAQWCHAPGSSQEADLDAELKNC 
HHYMQFAAAAYGWPLYIYRNPLTGLCRIGGDCCRSKNPQTMT/M 
VGGDQLQL/CTSAPILHTHRAAVQGLHPRQLPWTRFTELPFLVA 
LDHRKES WVAVRGTMS LQDVhTDhSAES E VLDVECEVQDRLAH 
KG I SQAAR YVYQRL INDG I LSQAFS I APE YRL VI VGHS LG GG AA 
ALIATMVRAAYPQVRCYAFSPPRGLWSKALQEYSQSFIVSLVLG 
KDVIPRLSVTNLEDLKRRILRWAHCNKPKYKILLHGLWYELFG 
GNPNNIiPTEIiDGGDQEVLTQPLLGEQSLLTRWSPAYSFSSDSPL 
DSSPKYPPLYPPGRIIHLQEBGASGRFGCCSAAHYSAKWSHEAE 
FSK I L I GPKMLTDHMPD I LMRALDS WSDRAACVS CPAQGVS 5 V 
DVA 


5388 


1569 


753 


tadggaggggrrqagvrrhylypf^ggVrrrraacqaerpaars " 
kdtdlaayqkgnlg vqlrnmaqe tnhs qvpmlcs tgcg f ygn pr 

TNGMC S VCYKEHLQRQNS SNGR ISP PVQCTDGS VPEAQSALDS T 
SSSMQ PSPVSNQSLLSES VAS SQLDSTS VDKAVPETEDVQAS VS 
DTAQQPSEEQSKSLE\NRNKKRIAVSCAGRKWDLLGLNAGVEMF 
TWYTVTQM YT I ALT ITKQM L KNFVFQQE F KS FGS FHQQLLE YK 
ILEHLQTKN 


5389 


1569 


753 


TADGGAGGGGRRQAGVRRHyLVPFTGGYRRRRAACQAERPAARS " 
KDTDLAAYQKGNLGVQLRNMAQETNHSQVPMLC STG CG FYGNPR 
TNGMCS VCYKBHLQRQNSSNGRISPP VQCTDGS VPEAQSALDS T 
SSSMQPSPVSNQSLLSESVASSQLDSTSVDKAVPETEDVQASVS 
DTAQQPSEEQSKSLE\NRNKKRIAVSCAGRKWDLLGLNAGVEMF 
TWYTVTQM YT I ALTI TKQMLKNFVFQQEFKS FGS FHQQLLE Y K 
ILEHLQTKN 


5390 


217 


1332 


EDPRKLMEDKMWSECEGPEMSLVCLTDFQAHAREQLSKSTRDFI 
EGGADDS ITRDDNIAAFKRIRLRPRYLRDVSEVDTRTTIQGEE I 
SAP ICIAPTGFHCLVWPDGEMSTARAAQAA\GICYI TSTFAS CS 
LED I VIAAPEGLR WFQL YVHPDLQLNKQLIQRVESLG FKALV I T 
LDTP VCGNRRHDIRNQLRRNLTLTDLQS PKKGNAI P YFQMTP I S 
TSLCWNDLSWFQSITRLPI 2LKGILTKEDAELAVKHNVQGI IVS 
N HGGRQLDE VLAS I DALTE WAAVKG K I E VYLDGG VRTGNDVLK 
ALALGAXCI FLGDAI LWALAS KGEHG VKE VLN I LTNE FHTS MA \ 
LTGCRS VAE INRUL VQFSRL 




1 


1292 


VJCKAAGRSRGP PTAGGQR CEEAPGTVWERRLG VRAW VKENRGS F 
QPPVCNKLMIIQEQLKVMFVGGPNTRKDYHIEEGEBVFYQLEGDM 
VLRVLEQGKHRD WIRQGE I FLLPARVPHSPQRFANTVGLWER 
RRLETELDGLRYYVGDTMDVLFEKWFYCKDLGTQLAP I IQEFFS 
SEQYRTGKPIPDQLLKEPPFPLSTRSIMEPMSLDAWLDSHHREL 
QAGT PLS LFGDT YETQVIAYGQGSSEGLRQNVD W7LWQLEGS SV 
VTMGGRRLSLGPWMDSLLVLSWGPSY\AW\ERTQGSVALSVT\Q 
DPACKKS PWGEPSCHGLKAATGVPSTLEVPSLPNNSPS PH YLSV 
YCRCVPHRPAHCCHPPSCPSQPRCHAPGRAAAPHLLWQrQPTAL 
PVLPGGLPPAPLLP I PLSLQTQCSTSTPRRPSIKAS 


5392 


1 


1623 


IRGSNAQKVVGASGSGGAGPQPDPAGPGGVPALAAAVLGACEPR " 
CAAPCPL PALSRCRGAGSRGSRGGRGAAG SGDAAAAAEW I R KGS 
FIHKPAHGWLHPDARVLGPGVS YWRYMGCIEVLRSMRSLDFNT 
RTQVTREAINRLHEAVPGVRGSWKKKAPNKALASVLGKSNLRFA 
GMSISIHISTDGLSLSVPATRQVIANHHMPSISFASGGDTDMTD 

YVAYVAXDPINORAPHTTil?r , r , l?rtT,\ &r\QTTQ , n/<?rsxvi?i dcv^v 

LHSPPKVALPPERLAGPEESAWGDEEDSLEHNYYNSIPGKEPPL 
GGL VDS RLALTQPCALTALDQG PS PSLRDACS LP WD VGS TGTAP 
PGDGYVQADARGPPDHEEJHLYVNTQGLDAPEPEDSPKKDLFDMR 
P PEDALKLHECS VAAGVTAAPLPLEDQ WPS PPTRRAP VAPTEEQ 
LRQEPWYHGRMSRRAAERMLRADGDFLVRDSVTNPGQYVLTGMH 
AGQPKHLLLVDPEGWRTKDVLFESISHLIDHHLQNGQPIVAAE 
SELHLRGWSREP 


5393


2 


982 


GGDSAGMTMiSXQMSQNVCPRNLWLLQPLTVLLIJ^ASADSQAAAP 
P KAVLKLE PPWINVLQ\ EDS VTLTCQGAPQP/ERSDS I QWFHNG 
\NL I PTHTQPS \ YRFKANNN\DSGE YTCQTGQTS l\SDPVHIjTV 

lsewlvlqtphlefqegetimlrchs\wrdkp\lvkvtffqngk 
sqkfshiidptfs i pqanhshsgdyhctgn ig ytlfss kpvtitv 
qvpsmgsss pmg 1 i vavvi atavaai vaavvali ycrkkrisan 
stdpvkaaqfeppgrqmiairkrqleetnndyetadggymtlnp 
raptdddkniyltlppndhvnsnn 


5394


2 


982 


GGDSAGMTMETQMSQNVCPRNLWLliQPLTVLLHiASADSQAAAP 
PKAVLKLEPPW INVLQ\EDS VTLTCQGAPQP /BRSDQ IQWFHNG 
\NLIPTHTQPS\YRFKANNN\DSGEYTCQTGQTSL\SDPVHLTV 
LSEWLVLQTPHLEFQEGETIMIiRCHS\WRDKP\LVKVTFFQNGK 
SQKFS HLDPTFS I PQANHSHSGD YHCTGN I G YTL FSS K P VTI TV 
Q VPSMGSSS PMG 1 1 VAW I ATAVAAI VAAWAL I YCRKKRI SAN 
S TDP VKAAQ F E P PGRQM I AI RKRQLE ETNNDY ETADGG YMTLNP 
RAPTDDDKNI YLTIjPPNDHVNSNN 


5395 


3135 


531 


RASDAKNQEGLLNTRRKSTDS VP I S KSTLSRSI^SIjQASDFDGAS 
S SGNPEAVALAPDAYSTGSS SASSTIjKRTKKPRPPSLKKKQTTK 
KPTETPPVKETQQEPDEESLVPSGENLASETKTESAKTEGPSPA 
LLEETPLEPAAGPKAACPLDSESVEGWPPASGGGRVQNSPPVG 
RKTIiPLTTAPEAGEVTPS DSGGQEDS PAKGHS VR LEFDYS EDKS 
S WDNQQENP P PTKKI GKKP VAKMPLRRP KM KKTP E KLDNT PAS P 
PRSPAEPNDI PIAKGTYTFD r DKWDDPNFNPFSSTSKMQES PKL 
PQQSYNFDPDTCDESVDPFKTSSKTPSSPSKSPASFEIPASAME 
ANGVDGDGLN KPAKKKKTPLKTDT FRVKKSPKRSPIjSDP PSQDP 
TPAATPETPPVISAVVKATDEEKLAVTNQKWTCMTVDIiEADKQD 
YPOPSDLSTFVNETKFSS PTEELDYRNS YEI E YMEKIGSSLPQD 
DDAPKKQALYLMFDTSQESPVKSSPVRMSESPTPCSGSSFEETB 
ALVNTAAKNQHPVPRGLAPNQESHLQVPEKSSQKELEAMGLGTP 
S EA I E I TAPEG S FASADALLS RLAHP VS IiCGALD YZiEPDLAE KN 
PPLFAQKLQRBAAHPTDVSISKTAliYSRIGTAEVEKPAGLLFQQ 
PDLDSALQIARAEIITKEREVSEWKDKYEESRREVMEMRKIVAE 
YEKTIAQMIEDEQREKSVS\HQTVQQLVIjEKEQa\IiADIjNSVEK 
\ SLADLFRRYBKMKEVLEGFRKNEEVLKRCAQEYLSRVKKEEQR 

yqalkvha\eekldranae\iaqvrgkaqqeqaahqaslaerss 

CRVXDAIiERTLEQKNKEIEEIiTKICDELIAKMGKS 


5396 


3135 


531 


RASDAKNQEGLLNTRRKSTDSVPISKSTLSRSLSLQASDFDGAS 
SSGNPEAVALAPDAYSTGSSSASSTLKRTiCKPRPPSLKKKQTTK 
KPTETP P VKETQQE PDEES LVPSGENIiASETKTBS AKTEGPS PA 
LLEETPLEPAAGPKAACPLDSESVEGWPPASGGGRVQNSPPVG 
RKTLPLTTAPEAGE VTPSDSGGQEDS 2AKGHSVRLE FDYSBDKS 
SWDNQQENPPPTKKIGKKPVAKMPLRRPKMKKTPEKLDNTPAS? 
PRSPAEPNDIPIAKGTYTFDIDKWDDPNFNPFSSTSKMQESPKL 
PQQS YNFD PDT CDES VDPFKTSSKTPS SPSKSPASFE IPASAME 
ANGVDGDGLNKPAKKKKTPLKTDTFRVKKSPKRS PLSDPPSQD? 
T PAAT PET P PV I S AWHATDEE KLAVTNQ KWTCMTVDIiEADKQD 
YPQPSDLSTFVNETKFSSPTEELDYRNSYEIEYMEKIGSSLPQD 
DDAPKKQALYLMFDTSQESPVKSSPVRMSESPTPCSGSSFBETE 
ALVNTAAKNQHPVPRGIAPNQESHLQVPEKSSQKELEAMGLGTP 
SEAIEI TAPEGS FASADALLS RIAHPVSLCGALDYLEPDLAEKN 
PP LFAQKLQREAAHPTDVS I S KTALYS R I GTAE VEKPAGLLFQQ 
PDLDSALQ IARAE 1 ITKEREVSEWKDKYEESRRE VMEMRKI VAE 
YEKTI AQM I EDEQREKS VS\ HQTVQQLVLEKEQA\LADLNS VE K 
\ S LADLFRR YEKMKE VLEGFRKNEE VL KRCAQE YLSR VKKEEQR 
YQALKVHA\EBKLDRANAE\ I AQVRGKAQQEQAAHQASLAERSS 
CRV\DALERTLE QKNKE IEELTKI CDEI* IAKMGKS 


5397 


3135 


531 


RASDAKNQEGLLNTRRXSTDSVPISKSTLSRSLSLQASDFDGAS " 

S SGNPEAVAIiAPDAYSTGSS S AS STLKRTKXPRPPSLXKKQTTX 

KPTET P P VKETQQE PDEESL VP S GENLASE TKTES AKTEGPS PA 

LLEETPLEPAAGPKAACPLDSESVEGWPPASGGGRVQNSPPVG 

RKTLPIiTTAPEAGEVTPSDSGGQEDSPAKGHSVRLEFDYSEDKS 

SWDNQQENPPPTKKIGKKPVAKMPLRRPKMKKTPEICLDNTPASP 

PRSPAEPNDIPIAKGTYTFDIDKWDDPNFNPFSSTSKMQESPKL 
PQQSYNFDPDTCDES VDPFKTSS KTPSS PS^S^AS FE IPASAME" 
ANGVDGDGLNKPAKKKKTPLKTDTFRVKKSPKRSPLSDPPSQDP 
TPAATPETPPVI SAWHATDEE KLAVTNQKWTCMTVDLEADKQD 
YPQPSDLSTFVNETKFSSPTEBLDYRNSYEIEYMEKIGSSLPQD 
DDAPKKQALYLMFDTSQESPVKSSPVRMSESPTPCSGSSFEETE 
AL VNTAAKNQH P VP RG LA PNQ ESHLQ VP E KSSQ KELEAMGLGT P 
SEAI B I TAPEGS FAS ADALLS RLAHP VSLCGALDYLEPDLAEXN 
P PLFAQ KLQREAAHPTD VS I S KTALYS R IGTAEVE KPAGLL FQQ 
PDLDSALQIARAEI ITI02REVSEWKDKYEESRREVMEMRKI VAE 
YEKTI AQM I EDEQREKSVS \HQTVQQLVLEKEQA\ LADLNS VEK 
\ S LAD LFRRYE KMKEVLEG FRKNEE V LKRCAQE YLSR VKKE EQR 
YQALKVHA\ EEKLDRANAE \ IAQVRGKAQQEQAAHQASLAERSS 
CRV\DALERTLEQKNKEIEELTKICDELIAKMGKS 


5398 


56 


5426 


SGBVCRMESNFNQEGVPRPS YVFSAD P I ARPSE I N FDGI KLDLS 
HE FSLVAPNTEANS FES KD YLQ VCLR I R P FTQS E KELESEG CVH 
ILDSQTWLKEPQ C I LGRLS E KS SG\QM \ AQKFS FFPG FLG PAT 
TQKEFFQGClMHP\VKDLLKGQSRIiIFTYGLTNSGKTYTFQGTE 
ENI R I LPRTLNVL FDS LQERL YTKMNLKPHRSRE YLRLS S EQE K 
EEI ASKSALLRQI KE VTVHNDSDDTL YGS LTNSLNI SE FE ES I K 
DYEQANLNMANSIKFSVWVSFFEIYNEYIYDLFVPVSSKFQKRK 
MLRLSQDVKG YS FI KDLQWIQVSDSKBAYRLLKLG I KHQSVAFT 
KLNNASSRSHSIFTVKILQIEDSEMSRVIRVSEI>SLCDIJ\GSER 
TMKTQNEG ERLRETGNINTSLLTLGKC I NVLKNS E KS KFQQHVP 
FRESKLTHYF/QSFFNGKGKICMIVNISQCYLAYDETLNVIiKFS 
A1AQKVCVPDTLNSSQEKLFGPVKSSQDVSLDSNSNSKILNVKR 
ATISWENSLEDLMEDEDLVEELENAEETED/VGETKLLDEDLDK 
TLEENKAFISHEEKRKLLDLIEDLKKKLINEKKEIOiTLEFKIRE 
EVTQEFTQYWAQREADFKETLLQEREI LEENAERRLAI FKDLVG 
KCDTREEAAKDICATKVETEEATACLELKFNQ I KAELAKTKGEti 
I KTKEELKKRENES DS L IQELETSNKK 1 1 TQNQR I KEL INI I DQ 
KEDTINEFQNLKSHMENTFKCNDKADTSSLI INNKL I CNETVEV 
PKDSKSKICSSRKRVNENELQQDEPPAKKGSIHVSSAITEDQKK 
SEEVRPNIABIEDIRVLQENNEGIjRAFLIiTIENELKNEKEEKAE 
LNKQIVHFQQ2LSLSEKKNLTLSKBVQQIQSNYDIAIAELHVQK 
S KNQEQEE KIMKLSNE IETATRS ITNNVSQ IKLMHTKI DELRTL 
DS VSQISNIDLLNLRDLSNGS EEDNL PNTQ LDLLGND YL VSKQ V 
KEYRIQE PNRENS FH S S I E A I WEECKE I VKASS KKS HQ I EEL E Q 
QI EKLQAEVKGYKDENURLKEKEHKNQDDLLKEKETL I QQLKBE 
LQEKNVTLDVQIQHWEGKRALSELTQGVTCYKAKIKELETILE 
TQKVERS HS AKLEQD I I*E KES 1 1 LKLERNLKEFQEHLQDS VKNT 
KDIiNVKELKLKEEITQLTNNLQDMKHLliQLKEEEEETNRQETEK 
L KEELS AS SARTQN\LNADLQR KEED YADL KE KLTDAKKQI KQ V 
QKEVSVMRDEDKLLRI KINE£iEKKkNQCSQELDMKQR\TIQQLK 
EQLINQKVEEAIQQYERACKDLNVKEKIIEDMRMTLEEQEQTQV 
EQDQ VL \ BAKLBEVERIiATELDR WR VKCNDL E TKNNQR SNKEHB 
NNTDVLGKLTNLQDELQESEQKYNADRKKWLEEKMMLITQAKEA 
ENIRNKEMKKYAEDRERFFKQQNEMEILTAQLTEKDSDLQKWRE 
ERDQLVAALEIQLKALISSNVQKDNEIEQLKRIISETSKIETX2I 
MD I KPKR I S S ADPDKLQTEP LSTS FE I SRNK I EDGS WLDS CEV 
S TENDQSTRFP KPELB I QFT PLQ PNKMAVKH PG CTT P VTVKIPK 
ARKRKSNEMKEDL VKCENKKNATPRTNL K F PI S DDRNS S VKKEQ 
KVAIRPSSKKTYSLRSQAS I IGVNLATKKKEGTLQKFGDFLQHS 
PS ILQSKAKKI IETMSSSKLSNVBASKENVSQPKRAKRKLYTSE 
ISSPIDISGQVILMDQKMKESDHQIIKRRLRTKTAK 


5399 


705 


230 


GPRMAKFLSQDQINEYKECFSLYDKQQRGKIKATDLMVAMRCLG 
AS P TPGE VQRHLQTHG I DGNG ELD FST FLT IMHMQI KQEDP KKE 
ILLAMLMVDKEKKGYVMASDLRS KLTS LGEKLTHKEV\ DDLFRE 
\AD I EPNGiCVKYDEFIHKITS YLDGTY 


5400 


931 


248 


SiiCS SGME I PPTN YPAS RAALVAQN Y I NYQQGTPHRVFB VQKVK 
QASMEDI PGRGHK YRLKFAVEE 1 1 QKQVXVNCTA3 VL YPS TGQE 
TAPEVNFTFEGETGKNPDEEDNTFYQRLKSMKEPIiEAQNl\PDN 
FGNVSPEMTDVLHliAWVACGYIIWQNSTEDTWYKMVKlOTVKQV " " 
QRNDD F I EliDY T I L LHNIAS QE I I PWQMQ VLWH PQYGT KVKHNS 
RLPKEVQLE ' 


5401 


3 


1360 


TGWS YGPTTSLAFLAPRDFP PP PKLIiI H PQAWRLS CGAGSMGS 
QAAAE WRN WASWEGS S SLSG CSMGCF KDDR I VFWTWMFS TY FME 
KWAPRQDDMLFYVRRKLAYSGSESGADGRKAAEPEVEVEVYRRD 
SKKLPGLGDPDIDWEESVCLNLILQKLDYMVTCAVCTRADGGDI 
H IHKKKSQQVFAS PS KH PMDS KGEESK I S YPNI FFM I D S F\ BE \ 
VFSDMT VGKGEMVCVE L VAS DKTNTPQG VI FQGS I R YEALKKVY 
DKrRVSVAARMAQK\MSFGFSKYSNMEF\VR\MKGPQGKGHAEMA 
VSRVSTGDTS PCGTEEDSSPAS PMHERVTSFSTPPTPERNNRPA 
F FS P SL KRKV PRNR I AEMKKS HS ANDSE E FFREDDGGADLHKAT 
NLRSRSLSGTGRSLVGSWLKLNRADGNFLLYAHIjTYVTLPIiHRI 
LTDILEVRQKPILMT 


5402 


3445 


1563 


GECFIMAAWQQNDLVFEFASNVMEDERQLGDPAIFPAVIVEHV 
PGADI LNS YAGLAC VEEPNDMI TESSLDVAEEE I IDDDDDD I TIi 
TVEASCHDGDETI ETI EAAEALLNMDSPGPMLDEKR INHNIFS S 
P EDDMWAP VTHVS VTLDGI PEVMETQQVQB KYAD S PGASSPEQ 
P KRKKGRKTKPPRPDS PATTPN I SVKKKNKDGKGNT I YLWEFLL 
ALLQDKATCP KYI KWTQREKG I FKL VDS KP V$ RLWR KHKN KP \ D 
MNYE PMGRALRYYYQRGI LAKVEGQRLVYQFKEMPKDL 1 YINDE 
DPSSSIESSDPSLSSSATSNRNQTSRSRVSSSPGVKGGATTVLK 
PGNS KAAKPKDPVEVAQPS EVLRTVQPTQSPYPTQLFRTVHWQ 
PVQAVPEGEAARTSTMQDE TLNSS VQS IR\T IQAPTQVPVWS P 
RNQQ\LHTVTLQTVPLTTVIASTDPSAGTGSQKFILQAIPSSQP 
MT VLKENVMLQS QKAGS PPSIVLG PARV\QQVLTSNVQT I CNGT 
VS V\ ASSPS FS \ ATAP WTLFLJjGSSQLVAHPPGTVITSVI KTQ 
ETKTLTQE VBKKESEDHLKENTEKTEQQPQPYVMWSS SNG FTS 
QVAMKQNELLEPNS F 


5403 


3445 


1563 


GEC FI MAAVVQQNDLVFEFASNVMEDERQLGDPAI FPAVI VEHV 
PGADI LNS YAGLACVEE PNDMITES SLDVAEEE 1 1 DDDDDD ITI* 
TVEASCHDGDETIETIEAAEALLNMDSPGPMLDEKRINNNI FSS 
PEDDM WA PVTHVS VTLDG I PE VMETQQ VQEK YAD S PGAS S P EQ 
PKRKKGRKTKP PR PDSPATTPNI SVKKKNKDGKGNTI YLWEFLL 
ALLQDKATCP KY I KWT Q RE KG I FKLVDS KPVSRLWRKH KNKP \ D 
MNYEPMGRALR Y Y YQRG I LAKVEGQRL VYQ FKEMPXDL I Y I ND E 
DPSSS I ESSDPSLSSSATSNRNQTSRSRVSSSPGVKGGATTVLK 
PGNS KAAKPKDPVEVAQPS EVLRTVQPTQSPYPTQLFRTVHWQ 
P VQAVPEGEAARTS TMQDETLNSSVQS I R\TIQAPTQVPVWS P 
RNQQ\ LHTVTLQTVPLTTVIASTDPS AGTGSQKFILQAI PS SQP 
MTVLKENVMLQSQKAGSPPSIVLGPARV\QQVLTSNVQTICNGT 
VSV\ ASSPSFS \ ATAP WTLFLLGSSQL VAHPPGTVI TSVI KTQ 
BTKTLTQEVEKKESEDHLKENTEKTEQQPQPYVMWSSSNGFTS 
QVAMKQNELLEPNS F 


5404 


187 


1111 


LPVTLIFAKMKTLQSTLLLLLLVPLIKPAPPTQQDSRIIYDYGT 
DNFEES I FSQDYEDKYLDGKNI KEKETVI I PNEKSLQLQKDEAI 
TPLPPKKENDEMPTCLLCVCLSGSVYCEEVDIDAVPPLPKESAY 
LYARFNKIKKLT\AKDFADIPNLRRLDFTGNLIEDIEDGTFSKL 
SLVEELSIiAENQLLKLPVLPPKLTLFNAKYNKIKSRGIKANAFK 
KLNNLTFLYLDHNALESVPLNLPESLRVIHLQFNNIASITDDTF 
CKANDTSYIRDRIEEIRLEGNPIVI^3KHPNQPTrT.WT5T dtpcvc 


5405 


2199 


1220 


QNSRSLHMDPQNQHGSGSSLWIQQPSLDSRPRLDYEREIQPTA 
ILSLDQIKAIRGSNEYTEGPSWKRPAPRTAPRQEKHERTHE 1 1 
PINVNNN YEHRHTSHLGHAVLPSNARGPI LSRS TS TGSAASSGS 
NSSASSEQGLLGRSPPTRPVPGHRSERAIRTQPKQLIVDDLKGS 
LKEDLTQH KF I CE QCGKCKCGE CTAPRTLPS CLACN RQ CLCSAE 
SMVEYGTCMCL\VKGIFYHCSNDDEGDSYSDNPCSCSQSHCCSR 
YLCMGAMSLFLPCLLCYPPAKGCLKLCRRCYDWIHRPGCRCKNS 
NTVY CKLE S CP SRGQG KP S 


5406 


279 


2732 


RWRTYNVEGPLTFMDVAIEFCLEEWQCLDTAQQNLYRNVMLENY 
RNI*VPLG/ 1 1 AVS KPD Li I T CLEQEKE PW E PMRRH BMVAKP P VMC" " 

SHPTQDFWPEQHIKDPFQKATLRRyKNCEHKNVHLKKDHKSVDE 

CKVHRGGYNGFNQCIiPATQSKIFLFDKCVKAFHKFSNSNRHKlS 

HTE KKLFKCKE CGKS F CMLSHLAQ HK I IHTR VNFCKCEKCGKAF 

NCPS I ITKH KR INTGE KP YTCE E CGKVFNWSS RLTTHKKN YTR Y 

KLYKCEECGKAFNKSS ILTTHKI I RTGEKFYKCKECAKAFNQSS 

NI.TEHFOCIHPGEKPYKCEECGKAF^PSTLTKHKRIHTGEKPYT 

CEECGKAFNQFSN LTTHKR IHTA\EKFYKCTECGEAFSRS \SNIi 

TKHKEIHTEKKPYKCEECGKAFKWSSKLTEHKL.THTGEKPYKCE 

KCGKAFNCPS I ITKHNRINTGEKPYTCEECGKVFNWSSRLTTHK 

KNYTRYKLYKCEECGKAFNKSSILTTHKKIHIEKXFYKCEECGK 

AFKWSSKLTEHKITHTGEKPYKCEECGKAFNHFSIL.TKHKRIHT 

GBKPYKCEECGKAFTQSSNLTTHKKIH'l'GEKFYKCEECXSKAPTQ 

SSNJbTTHKKIHTGGKPYKCEECGKAFNQFSTLTXHKIIHTEEKP 

YKCEECGKAFKWSSTLTKHKIIHTGEKPYKCEECG\KAFKLSST 

LSTHKIIHTGEKPYKCEKCGKAFNRPSNLIEHKKIHTGEQPYKC 

EECGKAFNYSSHLNTHKRIHTKEQPYKCKECGKAFNQYSNLTTH 

NKIHTGEKXiYKPEDVTVI LTTPQTFSNI K 


5407 


3 


659 


RPRRRQSSCCTGWIAGWLLRAAPRFCRRTETDMEQGKGLAVLIL 
AIILLQGTLAQSIKGNHLVKVYDYQEDGSVLLTCDAEAKNITWF 
KDGKMIGFLTEDKKKWNLGSNAKDPRGMYQCKGSQNKSKPLQVY 
YRMCQNCIELNAATI SGFLFAE 1VS I FDLAVGVYFIAGTGMEFR 
QS \RASDKQTLLP \NDPAPTQPLKDPRKMTQYSHLQGN\QLRRN 


5408


2745


6128 


QGSKGTCHPQAQQPWDEGVWQEAPSQSEPWGQSQEPPTMPQRIiP " 
HARQHT PJjPLGSAD YRRWS VR PQGPHRDPKDS RDAAKRE QGS L 
APRPVPASRGGKTLCKGYRQAPPGPPAQFQRPICSASPPWASRF 
STPCPGGAVRHDTYPVGTQGVPSLALAQGGPGGSWRFLEWKSMP 
RLPTDLDrGG P WFPH YDFERSCWVRAISQEDQLATCWQAEKCGE 
VRNKDMSWPEEMSFIANSSKIDRHKVPTEKGATGLSMLGNTCFM 
NSSIQCVSNTQPLTQYFISGRHLYELNRTNPIGMKGHMAKCYGD 
LVQBLWSGTQKNVAPXiKLRWTIAKYAPRFNGFQQQDSQELLAFIi 
LDGLHEDLNR VHEKP YVE LKD S DGRPDWE VAAEAWDNHLRRNRS 
I VVDLFHGQLRS Q VKCKTCGH I S VRFDP FNFLSL PLPMDS YMHI* 
EITV I KliDGTTPVRYGLRLNMDEKYTGLKKQLfSDLCGLtNSEQI L 
LAEVHGSNIKNFPQDNQKVRL.SVSGFLCAFEIPVPVSPISASSP 
TQTDFSSS PSTNEM FTLiTTNGDLi PRP I F I PNGMPNTW PCGTE X 
NFTNGMVNGHMPSLPDS PFTG Y I lAVHRKMMRTELYFIiSSQKNR 
PSLFGMPLIVPCTVHTRKKDLYDAVWIQVSRLASPLPPQEASNH 
AQDCDDSMGYQYPFTLRWQKDGNSCAWCPWYRFCRGCKIDCGE 
DRAFIGNAYIAA^WHPTALHIiRYQTSQERWDEHESVEQSRRAQ 
VE PINLDS CLiRAFTS EEELGENEMYYCSKCKTHCIiATKKIiDIiWR 
LPPILIIHLKRFQFVNGRWIKSQKIVKFPRESFDPSAFLVPRDP 
ALCQH K PliT PQGDELS E PR I L ARE VKKVDAQS SAGE EDVLLS KS 
PSSLSANIISSPKGSPSSSRKSGTSCPSSKNSSPNSSPRTLGRS 
KGRLRLPQIGSKNKLS S SKENLDAS KENGAGQ ICELADALSRGH 
VLGGSQPELVTPQDHBVALANGFLYBHEACGNGCGNGYSNGQriG 
NHSEEDSTDDQREDTRI KPI YNLYAISCHSGILGGGHYVTYAKN 
PMCKW YC YNDS SCKB LH PDE I D TDS AYI LF YEQQG I D YAQ FL P K 
TDGKKMADTSSMDEDFESDY\ EKYCVLQ 


5409 


2745 


6128 


qgskgtchpqaqqpwdegWqeApsqsepwgqsqepptmpqrlp 
harqhtplplgsadyrrwsvrpqgphrdpkdsrdaakreqgsl 

" rR v v "wkuua Ji u\~ A.o i KUAir POP PAQ r QRP 1 C3AS PP WASRF 
STPCPGGAVREDTYPVGTQ^VPSliALiAQGGPQGSWRFLEWKSMP 
RLPTDLDIGGPWFPHYDFERSCWVRAISQEDQLiATCWQAEHCGE 
VRNKDMS W P E EMS F I ANS S KI DRHKVPTE KGATGliSNLGNTCFM 
NSSIQCVSNTQPLTQYFISGRHLYELMRTNPIGMKGHMAKCYGD 
LVQELWSGTQKNVAPLKLRWT IAKYAPR FNG FQQQDS QEL1AFL 
LDGLHEDLURVHEKPYVELKDSDGRPDWEVAAEAWDNHLRJINRS 
IWDLFHGQLRSQVKCKTCGHISVRFDPFNFLSLPLPMDSYMHL 
EirVIKLDGTTPVRYGI.RLNMPEKYTGLKKQLSDJ^CGLNaEQIL 
LAEVHGSNI KNFPQDNQKVRLS VSGFLCAFEIPVP VS PISASSP 
tqtdfs s s ps tnem ftlttngdlprp i f i pngm pntvvp cgte k 
nftngmvnghmpslpdspftgyi iavhrkmmrtelyflssqknr 
pslfgmpllvpctvhtrkkdlydavwiqvsriiaspbppqeasnh 
aqdcddsmgyqypftlrwqkdgnsc^wcpwyrfcrgckidcge 

DRAFIGNAYIAVDWHPTAliHtjRYQTSQERWbEHESVEQSRRAQ 
VEPIWLDSCLRAFTSEEELGENEMYYCSKCKTHCLATKKLDliV?R 
LPPILI IHLKRFQFVNGRWIKSQKI VKFPRES FDPSAFLVpRDP 
ALCQHKPLTPQGDELSEPRI LAREVKKVDAQSSAGEEDVLLSKS 
PSSLSANIISSPKGSPSSSRKSGTSCPSSKNSSPNSSPRTLGRS 
KGRLRLPQIGS KNKLSSS KENLDAS KENGAGQ I CE LADALS RGH 
VLGGSQPELVTPQDHEVAIiANGFLYEHEACGNGCGNGYSNGQLG 
NHSEEDSTDDQREDTRIKP I YNLYAI SCHSGILGGGHYVTYAKN 
PNCKW YC YNDS SCKELH PDE I DTDSAY I LFYEQQG IDYAQFLPK 
TDGKKMADTS SMDEDFES DY \ EKYCVLQ 


5410 


2 


710 


LRPPGQARHVWLAARMQAPHKEHL YKL2JVIGDLG VGKTS I I KRY 
VHQNFSSHYRATIGVDFAIiKVIjHWDPETWRLOIjWDIAGQERFG 
NMTRVYYREAMGAFIVFDVTRPATFEAVAKWKNDLDSKIiSLPNG 
KPVSWLI^ANKCDQGKDVIiMNNGLKMDQFCKEHGFVGWFETSAK 
ENIKIDEASRCLVKHI LANECDLMES I EPDWKPHLTSTKVASC 
SG\CAKI LVGTFAGVW 


5411 


1302 


289 


TGPAAAGRRKALGS FGKPS PVTGLRAARRRRTR PSAPAAPS VGC~~ 
GKRRESDAGAGGERASVRTGSGRRGGRTMAGDSEQTLQNHQQPN 
GGEPFL I GVSGGTASGKSS VCAKI VQLIjGQNE VDYRQKQ WILS 
QDSFYRVLTSEQKAKALKGQFNFDHPDAFDNELILKTliKEITEG 
KT VQI P V YDFVSHSRKEETVTVYPAD WLFEG I LAF YS QBR / 1 R 
DLFQM KL FVDTDADTRLSRR VLKD I S E RGRDLE Q I L S S STLR F V 
KPA\FEEFCLPPK\KYADVIIPR\GADN\RVPINLIVQHIQ\DI 
LNGG PS \NRQTNGCI*NGYTPS RKRQASES SSRPH 


5412 
5413 - 


3180 


313 


QGISNFFHKEANFWFEVSG YL ISPLRS PFVDPAIiEWSI^MAS PWN 
KMEGESSRFEIHTPVSDKKKKKCSIHKERPQKHSHEIFRDSSLV 
NEQSQ I TRR KKRKKDFQHIj I S S PIiKKS R I CDETANATSTIiKKRK 
KRRYSAI*E VDEEAGVT WLVDKENINNT P KHFRKDVD WC VDMS 
IEQKLPRIC\ PKTDKFQVZjAKS H \ AHKS EALHS KVREKKNKKHQR 
KAASWESQRA\RDTIiPQSEFPTQEES WLS VGPGGEI TELP \ ASA 
HKNKS KKKKKKS SNRE YET \ IiAMPEGS QAGRE AGTDMQESQ PTV 
GLDDETPQLLGPTHKKKSKKKKKKKSNHQEFESIiAMPEGSQVGS 
EVGADMQES \RPAVGLHGETAGI P APA YKNKS KKKKKKSNHQEF 
EAVAMPES LESA YPEGSQVGS E VGTVEGS TALKG FKESNSTKKK 
S KKRKLTS VKRARVSGDDFSVPS KNSBSTLFDS VEGDGAMMEEG 
VKSRPRQ KKTQACLAS KHVQEAPRLEPANEEHNVBTAEDSE IRY 
LSADSGDADDSDADIjGSAVKQLQEFIPNI KDRATSTI KRMYRDD 
IiERFKE FKAQGVA I KFG KFS VXEKTFCQLE iOTVEDFIALTG XES AD 
KLLYTDRYPEEKSVITNLKRRYSFRLHIG\RNXARPWKLIYYRA 
KKMFDVNNYKGRYSEGDTEKLKMYHSLLGNDWKTIGEMVARRSL 
SVALKFSQISSQRNRGAWSKSETRKLIKAVEEVILKKMSPQBLK 
EVDSKLQEWPESCLSrVREKLYKOISWVEVEAKVQTRNWMQCKS 
KWTEILTKRMTNGRRIYYGMNALRAKVSLrERLYEINVEDTNEI 
DWEDLAS AI GD Vp PS YVQTKFS RLKAV YVP F WQK KTF P E I IDYL 
YETTLPL LKE KLE KMMEKKG TK I QTPAAPKQVFPFRD I FYYEDD 
S EGGGHRKRKRRPRRHAWFTP V I PVLWEAKAGWI I 




3753 


1304 



RFPAGVAPRRAMANVSKKVSWSGRDRDDEEAAPLLRRTARPGGG 
TPLLNGAGPGAAROSP1?<3AT.PPW3HMQCWt nnrrr r c»r»\ nwrinn 

H PFP KE I PHNEKLLSL KYESLD YDNS ENQL FLEEERR I NHTAFR 
TVEIKRWVICAL1GILTGLVACFIDIWENLAGLKYRVIKGNID 
KFTE KGGLS FS LL LWATLNAAF VLVGS VI VAF I EP VAAGSG I PQ 
IKCFLNGVKIPHWRLKTLVIKVSGVII.SWGGLAVGKEGPMIH 
SGSVI AAG I SQGRSTSLKRDFKI FE YLRRDTEKRDFVS AGAAAG 
VS AAFGAPVGGVLFSLEEGAS FWNQFLTWRI FFASMI STFTIiNF 
VLSrYHGNMWDLSSPGLINFGRFDSEKMAYTlHEIPVFIAMGW 
3G VLGAVFNALN YWLTM FRIR Y I HRP CXQVI EAVLVAAVTATVA 
FVLI YS S RDCQ PLQGGS MS YP LQLFCADGE YNSMAAAF FNTPE K 
S WS LFHDP PGS YN PLTLGL FTL V YFFLACWT YGLTVS AG VF I P 
SLL I GAAWGRL ?G I S LS Y LTG AA I WADPG KYALMGAAAQLGGI V 
RMTLfl n XVIMMEATSNVT YGFP IMLVLMTAKI VGDVFI EGLYDM 
HIQLQSVPFBHWEAPVTSHSLTAREVMSTPVTCI/RRREKVGVIV 
DVLS DTASNHNG FP WEHADDTQ PAR LQGL I LRSQL I VLLiKHKV 
FVERSNLGLVQRRLRLKDFRDAYPRFPP IQS IHVSQDERECTMD 
LSEFMWPSPYTVPQEASLPRVPKLFRALGI.RHLWVDNRNQWG 
LVTRKDLARYRLGKRGLEELSLAQT 


5414 


2130


390 


GVAS AWDRAL FS PLLS PTSRVFRTS PPRC VSTETGRRDRARVPS 
QWCS VLQGKL P VSGRTS LACVRS I LLS PAS S PRKVG I VGGTGAR 
AGAAPRDHGRVRHRRPSSARRMTRTTGQCXAPRGCQG PRGTRS p 
RS PRS RTRRGCS AS PACLP / CRS All I VAVLC Y IN r JJNYMDRFTV 
AG VL P D I EQFFN I GD S S 5GL I QTVF I SS YMVLAPVFG YLGDR YN 
RK YLMCGG I AFWS L VTLGS S F I PGEH FWLliLLTRGliVGVG EAS Y 
STIAPTLIADLFVADQRSRMLSIFYFAIPVGSGLGYIAGSKVKD 
MAGDWHWALRVTPGIXJVVAVLLIjFLVVREPPRGAVERKSDLPPL 
NPTS W WADURAIiARN PS F VliSSLG FTAVAFVTGS LALW APAFLiL 
RSRWLGETPPCLPGDSCSSSDSLIFGLITCLTGVIjGVGLGVEI 
5 RRLRHSNPRAD P L VCATGL LGSAP FL FLSLACARGS 1 VAT Y 1 F 
IFIGETLLSMNWAIVADILLYVVI PTRRSTAEAFQIVXiSHLLGD 
AGSPYLIGLISDRIJ^WPPSFLSEFRALQFSLMLCAFVGALGG 
AAFLGTAHLH 


5415 


693 


2986 


I PPKTKLE LQKH \ LTTLT \NQEQAT 1 FEE VQKLRPRNEQRENEL 

IISFLRCLFEBKQKEHIHIGEMKQTSQMAAENIGSEliPPSATRF 

RLDMLKNKAKRSiTESLES ILSRGNKARGLQBHS I S VDLDS S LS 

STLSNTSKEPSVCEKEALPISESSPKLLGSSEDLSSDSESHLPE 

EPAPLSPQOAFRRRANTLSHFPIECQEPPQPARGSPGVSQRKLM 

R YHSVSTET PH EK KDFES KANHLGDSGGTPVKTRRHSWRQQI FL 

RVATPQKACDSSSRYEDYSELGELPPRSPLEPVCEDGPFGPPPE 

EKKRTSRELRELWQKAI LQQ I LLLRME KENQKLQAS ENDLLN KR 

LKUDYEEirPCLKEVTTVWEKMLSTPGRSKIKFDMEKMHSAVGQ 

GVP\RHHRGEIWKFLAEQFHLKHQFPSKQQPKDVPYKELLKQLT 

SQQHAI L 1 DLGRTF PTH P YFS AQLGAGQLS LYN 1 L KAYS LLDQ2 

VGYCQGLS F VAGI LLLHMS EE EAFKMLKB*LMFDMGL»RKQYRPDM 

IILQIQMYQLSRLLHDYHRDLYNHLEEHEIGPSLYAAPWFLTMF 

ASQFPLG F VARVFDM I FLQGTE VI FKVAIiSLLGSHKPLI LQHEN 

LETIVDFIKSTLPNLGLVQME1CTINQVFEMDIAKQLQAYEVEYH 

VI^EELIDSSPLSDNQRiWDIOwEKTNSSLRKQNLDLLEQLQVANG 

RI QSLE AT X EKLLS S ES KL KQAMLTLE LERS ALLQT VEBLRRRS 

AKPSDREPECTQPEPTGD 


5416 


27 


4074 



KSQLFCFWGGKAGDILSGDQDKEQKDPYFVETPYGYQLDLDFLK 
YVDDIQKGNTIKRLWIQKRRKPSVPCPEPRTTSGQQGIWTSTES 
LSSSNSDDNKQCPNFLIARSQVTSTPISKPPPPLETSLPFLTIP 
ENRQLPPPSPQLPKHNLHVTKTLMETRRRLEQERATMQMTPGEF 
RR PRLAS FGGMG TTS S LPS F VGSGNHNPAKHQLQNG YQGNGD YG 
SYAPAAPTTSSMGSSIRHSPLSSGISTPVTNVSPMHLQHIREQM 
AIALKRLKELEEQVRTIPVLQVKISVLQEEKRQLVSQLKNQRAA 
SQINVCGVRKRSYSAGNASQLEQLSRARRSGGELYIDYEEEEME 
TVEQSTQRIKEFRQL\TADMQALEQKIQDSSCEASSELRENGEC 
RS VAVGAEENMNDI VWHRGS RS CKDAAVGTIiVEMRNCGVS VTE 
AMLGVMTEADKE I ELQQQTI ESLKEKI YRLEVQLRETTHDREMT 
KLKQELQAAGSRKKVDKATMAnPT.\n^c!UT/\7p7vx7\ronnDn r^rtr 

MDLVDTCVGTSVETNSVGISCQPECKNKWGPELPMNWWIVKER 
VEMHDRCAGRSVEMCDKSVSVEVSVCETGSNTEESVNDLTJLLKT 
NLNLKEVRS IGCGDCSVDVTVCS PKECASRGVNTEAVSQVEAAV 
MAVPRTADQDTSTDLEQVHQFTNTETATLIESCTNTCLSTLDKQ 
TSTQTVETRTVAVGEGRVKDIfcTS STKTRS IGVGTLLSGHSGFDR 
PSAVXTKESGVGQINIWDNYI*VGLKMRTIACGPPQIiTVGIiTASR 
RSVGVGDDPVGESLENPQPQAPLGMMTGLDHYIERIQKLLAEQQ 
rLLAENYSELAEAFGEPHSWGSLUSQLlSTIjSSINSVMKSAST 
SELRNPDFQKTSLGKlTGSYIiGYTCKCGGIiQSGSPLSSQTSQPE 
qevgtsegkpissldafptqegtlspvnlYodqiaaglyactnn 
estlks imkkkdgnkdsngakknlqf vg ingg yetts sddss sd 
essssesddecdv1eyplebeeeeedbdtrgmaeghhavniegl 
KSAR VEDEMQVQECEPEKVE I RERYELS EKMLSACNLLKNT IND 
PKALTSKDMR FCLNTLQHE WFRV5 S QXSA I PAMVGDY I AAFEA I 
S PDVLR Y V I N LADGNGNTALHYS VSHSNFE I VKLLI JDADVCNVD 
HQNKAGYTPIMLAALAAVE1AEKDMRIVEELFGCGDWAKASQAG 
QTALMLAVSHGR I DM VKGLLA CGAD VN I QDDEGSTALMCASEHG 
H^IVKLLLAQPGCNGHLEDNDGSTALSIALEAGHKDIAVIjLYA 
HVN?AKAQSPGTPRLGRKTSPGPTHRGSFD 


5417


27



5418 



24 



5419 



1335 



5420



TTT 



4074 1 KSQLFCFWGGXAGDILSGDQDXEQKDPYFVETPYGYQLDLDFLK 

YVDDIQKGNTIKRLNIQICRRKPSVPCPEPRTTSGQQGIWTSTES 
LSSSNSDDNKQCPNFI^IARSQVTSTPISKPPPPLETSLPFLTIP 
ENRQIiPPPSPQLPKHNLHVTKTLMETRRRLEQERATMQMTPGEF 
RRPRLAS FGGMGTTSS LPS F VGS GNHNPAKHQLQNG YQGNGDYG 
S YAP AAP TTS SMGS S I RHS P LS SG I ST P VTNVS PMHLQH I REQM 
AIALKRLKELEEQVRTIPVLQVKISVLQEEKRQLVSQLKNQRAA 
SQ INVCGVRKR3 YSAGNASQLEQLSRARRSGGELYI DYEEEEME 
TVEQ3TQRIKEFRQL\TADMQALEQKIQDSSCEASSELRENGEC 
R S VAVGAEENMND I WYHRGSRS CKDAAVGTLVEMRNCG VS VTE 
AMLG VMTBADKE I E LQQQT I E S LKE KI YRLEVQLRETTHDREMT 
KLKQELC^GSRKKVDKATMAQPL^SKWEAVVQTRDQMVGSH 
MDLVDTCVGTS VETNS VG Z SCQPECKNKWGPELPMNWW I VKER 
VEMHDRCAGRS VEMCD KS VS VB VS VCETG SNTE ESVNDLTLLKT 
NLNLKEVRS IGCGDCSVDVTVCS PKECASRGVNTEAVSQVEAAV 
MAVPRTADQDTS TDLEQ VHQF TNTETATL IES CTNTCLS TLDKQ 
TSTQTVETRTVAVGEGRVKDINSSTKTRSIGVGTLLSGHSGFDR 
P SAVKTKESG VGQIN I NDKf YLVGLKMRT IACGPPQI .TVGLTASR 
RS VGVGDDP VGES LEJNPQPQAPLGMMTGLDH YIERIQ KLLAEQQ 
TLIAENYSELAEAFGEPHSQMGSLNSQLISTLSSINSVMKSAST 
EE LRN PDFQKTSLGKI TGS YLG YTCKCGGLQSGS PLS SQTS QPE 
QBVGTSEGKP X S S LDAFPTQEGTLS P VNLTDJDQIAAGL YACTNN 
ESTLKSIMKKKDGNKDSNGAKKNLQFVGINGGYETTSSDDSSSD 
ESSSSESDDECDVIEYPLEEEEEEEDEDTRGMAEGHHAVNIEGL 
KSARVEDEMQVQECEPEKVEIRERYELSEKMLSACNLLKNTIND 
PKALTSKDMRFCLNTLQHEWFRVSSQXSAI PAMVGDY I AAFEAI 
SPDVLRYVINLADGNGNTALHYSVSHSNFEIVKLLLDADVCNVD 
HQNKAG YTPIMLAALAAVEAEKDMR I VEELFGCGDVNAKASQAG 
QTALMLAVSHGR I DMVKGLLACGADVNIQDDEGSTALM CAS EHG 
HVB I VKLLLAQPG CNGHLEDNDGS TALS I ALEAGHKDIAVLLYA 
HVNFAKAQS PGTPRLGRKTS PGPTHRGS FD 



1133 



S VPRAGGDME TGAAELYDQALLGI LQH VGNVQDFLRVLFG FLYR 
KTDFYRLLRHPSDRMGFPPGAAQALVLQVFKTFDHMARQDDEKR 
RQELEEKIRRKEEEEAKTVSAAAAEKEPVPVPVQEIEIDSTTEL 
DGHQEVEKVQ P PG P VKEMAHGS QEAE APGAVAGAAE VPR\ E P ? I 
LPRIQEQFQKNPDSYNGAVRENYTWSQDYTDLEVRVPVPKHWK 
GKQ VS VALSS S S I R VAMLEENG ERVLM3GKLTHKINTES S L WS L 
EPGKCVLVNLS KVGE YWWNAI LEGEEP IDIDK INKERS MAT VDE 
EEQAVLDRLT FDYHQKLQGKPQSHBLKVHEMLKKGWDAEGS PFR 
GQRFDPAMFNISPGAVQF 



GTHPLDPDLVSRTSVQGPLMTMACPGMSDTEKSPKLGPRAAEEG 
S ES E ACEAFGRRKS E EEGRRS DTSG FGRSRKHKVNWKHP ERADA 
KDPASLPQC/I.GP/DCVRPAQPSSKYCSDDCGMKLAANRIYEIL 
PQRI QQW QQ S PC I AEEHG KKLLER IRREQQSARTRLQEMERRFH 
ELEAIILRAKQQAVREDEESNEGDSDDTDLQIFCVSCGHPINPR 
VALRHMBRCYAKYESQTSFGSMYPTRIEGATRLFCDVYNPQSKT 
YCKRLQVLCPEHSRD PKVPADE VCG C PLVRDVFELTGDFCRLP K 
RQCNRHYCWEKLRRAEVDLERVRVWYICLDELFEQERNVRTAMTN 
RAGLLALMLHQTIQHDPLTTDLRSSADR 



1733

NEAGGACPFKGGASGRLYLSPRLPRVSVAGCEERPLGWVWVLGG

GGFLPARPPRAQRHLGFSHAEOSMEAPDYEVLSVREQLFHRRTR
ECIISTLLFATLYILCHIFLTRFKKPAEFT^GMMKMPPSTRV~~ 
I LLELCTFTLAI ALGAVLLL PFS 1 1 SN3VLLS LPRNY YI QWLNGS 

LIHGLWNLVFLFSNLSLIFLMPFAYFFTESEGFAGSRKGVLGRV 
YETVVMLMLLTLLVLGMVWVASAI VDKNKANRESLYDFWEYYLP 
YL YS C IS FLG VLLLLVC TP LGLARMF S VTGKLLVKPRLLEDLEE 
QLYCSAFEEAALTRRICNPTSCWLPLIMELLHRQVLALQTQRVL 
LEKRRKASAWQRNLGYPLAMbCLLVLTGLSVLIVAIHILELLID 
EAAM PRGMQGTS LGQVS FS XLGSFGAVI QWL I F YLMVSS WGF 
YS S PLFRS LR PRWHDTAMTQ 1 1 GNC VCLLVLS S ALPVFS RTLGL 

TRFDLLGDFGRFNWLGNFYIVFLYNAAFAGLTTLCLVKTFTAAV 
RAELIRAFGERE


5421


117 


1733 


NEAGGACPFKGGASGRLYLSPRLPRVSVAGCEERPLGWVWVLGG 
GGFLPARPPRAQRHLGFSHAEQSMEAPDYEVLSVREQLFHERIR 
ECIISTLLFATLYILCHIFLTRFKKPAEFTT\GMMKMPPSTRL/ 
LLELCTFTLAI ALGAVLLLPFS 1 1 SNEVLLSLPRNYYIQWUWGS 
LIHGLWNLVFLFSNLS LI FLMPFAYFFTESBG FAGSRKGVLGR V 
YETWMLMLLTLLVLGMVWVASAIVDKNKANRESLYDFWEYYLP 
YLYSC I SFLGVLLLLVCTPLGLARMFSVTGKLLVKPRLLEDLEE 
QL YCS AFEEAALTRRI CNPTS CWL P LD MELLHRQ VLALQTQR VL» 
LEKRRKASAWQRI«,GYPLAWLCLLVLTGLSVLIVAIHILELLID 
EAAM PRGMQGTS LGQVS FSKLGSFGAV IQWLI FYLMVS SWGF 
YSSPLFRSLRPRWHDTAMTQI IGNCVCLLVLSSALPVFSRTLGL 

TRFDLLGDFGRFNWLGNFYIVFLYNAAFAGLTTLCLVKTFTAAV 
RAELIRAFGERE 


5422


3 


1263 


SCGESLPTWIAGASRPGIGRKGGAWGGRGGSSPAQVLLSPGPVF 
KAQCNWWHLSRDQAG VQRCDLGSSQ P PPLGFKRFS CLSLPSS WD 
YRSTVLCVSKMEADLSGFNIDAPRWDQRTFLGRVKHFLNITDPR
TVFVSERELDWAKVMVEKSRMG WP PGTQVEQLT .YAKKLYDSAF 
HPDTGEKMNVIGRMSFQLPGGMIITGFMLQFYRTMPAVIFWQWV 
NQS FNALVNYTNRNAAS PTS VRQMALS Y FTATTTAVATAVGMNM 
LTKKAP P LVGRW VP FAAVAAANCVN I PMMRQQEL I KGI CVKDRN 
ENEIGHSRRAAAIGITQWISRITMSAPGMILLPVIMERIjEKLH 
FMQKVKVL/SAPLQVMLSGCFLIFMVPVACGLFPQKCELPVSYL 
EPKLQDTI KAKYGELEPYVYFNKGL 


5423


3186


905 


GVSMALGEEKAEAEASEDTKAQS YGRGS CRERELDI PGPMSGEQ 

P PRL EAEGGLI S P VWGAEG I PAPTCW I GTDPGGPS RAHQPQASD 

ANREPVAERSEPALSGLPPATMGSGDLLLSGESQVEKTKLSSSE 

EFPQTLSLPRTTI CSGHDADTEDDPSLADLPQALDLSQQPHS SG 

LSCXSQWKSVLSPGSAAQPSSCSISASSlXaSSLQGHQERAEPRG 

GSLAKVSSSLEPWPQEPSSWGLGPRPQWSPQPVFSGGDASGL 

GRRRLSFQAEYWACVLPDSLPPSPDRHSPLWNPNKEYEDLLDYT 

YPLRPGPQLPKKLDSRVPADPVLQDSGVDLDSFSVSPASTLKSP 

TNVSPNCPPAEATALPFSGPREPSLKQWPSRVPQKQGGMGLASW 

SQLASTPRAPGSRDARWERREPALRGAKDRLTIGKHLDMGSPQL 

RTRDRGW P S PRP ERE KRTSQS ARRPTCT E SRWKS E EE VESDDE Y 

LALPARLTQVSSLVSYLGSISTLVTLPTGDIKGQSPLEVSDSDG 

PASFPSSSSQSQLPPGAALQGSGDPEGQNPCFLRSFVRAHDSAG 

EGSLGSSQALGVSSGLLKTRPSLPARLDRWPFSDPDVEGQLPRK 

GGEQGKE SL VQC \ VKTFC \ CQLEELICWL YW\ AD VTDHGTPAR 

SNLTSLK\SSLQLYRQFKKDIDEHQSLTESVLQKGEILLQCLLE 

NTPVLEDVLGRIAKQSGELESHADRIjYDS ILASLDMLAGCTLI P 
DJCKPMAAMEHPCEGV 


5424


3186 


905 


GVSMALGEEKAEAEASEDTKAQS YGRGS CRERELD I PGPMSGEQ 
PPRLEAEGGLISPW7GAEGIPAPTCWIGTDPGGPSRAHQPQASD 
ANREPVAERSEPALSGLPPATMGSGDLLLSGESQVEKTKLSSSE 
EFPQTLSLPRTTICSGHDADTEDDPSLADLPQALDLSQQPHSSG 
LSCLSQWKSVLSPGSAAQPSSCSISASSTGSSLQGHQERAEPRG 
GSIAKVSS SLEPWpQEPSS WGLGPRPQWSPQPVFSGGDASGL 
3RRRLSFQAEYWACVLPDSLPPSPDRHSPLWNPNKEYEDLLDYT 
Sf PLRPGPQLP KHLDS RVP ADP VLQDSQVDLDS FS VS PAS TLKS P 
rMVSPNCPPAEATALPFSGPREPSLKQWPSRVPQKQGGMGIASW 






WO 01/53312 



PCT/US00/34263 



sqlastprapgsrdarwerrepaLrgakdrltigkhldmgspql 

RTRDRGWPSPRPBREKRTSQSARRPTCTESRWKSEEBVESDDEY 

iialparltqvsslvsyiigsistlvtlptgdikgqsplevsdsdg 
pasfpssssqsqlppgaalqgsgdpegqnpcflrsfvrahdsag 
egslgssqalgvssgllktrpslparldrwpfsdpdvegqlprk 
ggeqgkeslvqc\vktfc\cqleelicwlynv\advtdhgtpar 
snltslk\sslqlyrqfk:<didehqsltesvlqkgeillqclle 
ntpvledvlgr i axqsgeleshadrlyds ilasldmlagctli p 
dkkpmaamehpcegv 


5425 


1086 


115 


GFCPSPSIA3HQPPRVLHPTMSMAVETFGFFMATVGLLMLGVTLP 
NSYWRVSTVHGNVITTNTIFENLWFSCATDSLGVYNCWEFPSML 
AL SG Y I QACRALMI TA 1 LLG FLGLLLG IAGLRCTNIGGL E LS R K 
AKLAATAGAPH\ ILPGICGMVAI \SWYAFNITR\DFSDPLYPGT 
KYELGPALYLGWSASLISILGGLCLCSACCCGSDEDPAASARRP 
YQAP VS VMP VAT S DQ EGDS S FGKYGRNALR VAALCRG PRCL PTA 
PKKRGPGRGPFPYSNLRGRPRPVPVAPPRPRPRVLHSHGPSQAK 
NCSWEVAYLPSEAGSLI F 


5426 


42 


3435 


ATSSQSLGRADPPRGGTMERSPGEGPSPSPMDQPSAPSDPTDQP

PAAHAKPDPGSGGQPAGPGAAGBALAVLTSFGRRLLVLI pvyla 

GAVGLSVGFVLFGLALYLGWRRVRDEKEKSLRAARQLLDDEEQL 

TAKTL YMS HRELP AWVS FP DVE KAEWLNKI VAQVW PFLGQ YMEK 

LLAETVAPAVRG SNPHLQT FTFTRVE LGEKPLR I IGVKVHPGQR 

KEQILLDLNISYVGDVQIDVEVKKYFCKAGVKGMQLHGVLRVIL 

BPLIGDLPFVGAVSMFFIRRPTI^INWTGMTNLLDIPGLSSLSD 

TMIMDSIAAFLVLPNRLLVPLVPDLQDVAQLRSPLpRGIIRIHL 

LAARGLSSKDKYVKGLIBGKSDPYALVRLGTQTFCSRV1DEELN 

PQWGETYEVMVHEVPGQElEVEVFDKDPDKDDFIiGRMKLDVGKV 

LC»SVLDDWFP1jQGGQGQVHI,RLBWI j SLLSDAEKI,EQVLQWNWG 

VS SRPDPPSAAI L WYLDRAQDLPMVTS ELYP PQLKKGNKEPNP 

MVQLS IQDVTQES KAVYSTNCPVWEEAFRFFLQDPQSQELDVQV 

KDDSRALTI^ALTLPLARLLTAPELILDQWFQLSSSGPNSRLYM 

KLVMR I L YLDS SE I CFP T VPGCPGAWD VDSENPQRGSS VDAP PR 

PCHTTPDSQFGTEHVLR I KVLEAQDL I AKDRFLGGLVKG KSDP Y 

VKLKLAGRSFRSHWREDLUPRWNEVFEVIVTSVPGQEIiEVEVF 

DKDLDKDDFWRCKVRLTTVI*NSGFI»DEWLTLEDVPSGRIjHLRL 

E R LTPR PTAABLEE VLQ VNS L I QTQKS AELAAALLS I YME RAE D 

L PLR KGTKHLS P YATLTVGDS S HKTKT I SQTS AP VWDES AS FL I 

RKPKTESLELQVRGEGTGVLGSLSLPLSELLVADQLCLDRWFTL 

S SGQGQ VLLRAQLG I LVSQHSG VEAHSHS YSHS SS S LSEEP ELS 

GGPPHITSSAPEV\RQRLTHVDSPLEAPAGPIX5QVKLTLWY¥SE 

ERKLVS I VHGCRSLRQNGRDP PDP YVSIjLLLPDKNRGTKRRTSQ 

KKRTLSPEFNERFEWELPLDEAQRRKLDVSVKSNSSFMSREREL 

LGKVQLDLAETDLSQGVARWYDLMDNKDKGSS 


5427 


42 


3435 



ATSSQSLGRADPPRGGTMERSPGEGPSPSPMDQPSAPSDPTDQP
PAAHAKPD PGS GGQPAGPGAAGEALAVLTS FGRRLLVL I P VYLA 
GAVGLS VG F VLFGLAL YLG VJRRVRDE KERSLRAARQLLDDE EQL 
TAKTLYMSHRELPAWVSFPDVEKAEWLNKIVAQVWPFLGQYMEX 
LLAETVAPAVRGSNPHLQT FTFTR VELG EKPLR I IGVKVHPGQR 
KEQILLDLNlSYVGDVQIDVEVKKYFCKAGVKGMQliHGVLRVIL 
EPLIGDLPFVGAVSMFFIRRPTLDINWTGMTMLIiDIPGLSSLSD 
TMIMDS IAAFLVLPNRLLVPLVPDLQDVAQLRSPLPRGI IRIHL 
LAARGLSSKDKYVKCiT.TRfiTf cinDViT vpt rsTrwrrcDUTriBut vr 

PQWGET YEVMVHBVPGQE I B VE VFDKDPD KDDFLGRM KLD VG KV 
LQASVLDDWFPLQGGQGQVHLRLEWLSIiLSDAEKLEQVLQWNVra 
VS SRPDPPSAA I LWYLDRAQDLPMVTSEI>YPPOI»KKGNKEPNP 
MVQLS IQDVTQES KAVYSTNCPVWEEAFR FFLQDPQS QELDVQV 
KDDSRALTLG ALTLPLARLLTAPELI LDQWFQLS S SG PNSRL YM 
KLVMR ILYLDSSE ICFPTVPGCPGAWDVDSENPQRGS SVDAP PR 
PCHTTPDSQFGTEHVLRIHVLEAQDLIAKDRFLGGLVKGKSDPY 
/KLKLAGRS FRSHWREDLN PRWNEVFEVI VTS VPGQELEVE VF 
DKDLDKDDFI^RCKVRLTTVIiNSGFIJ)EWLTLEDVPSGRLHLRL
ERLTPRPTAAELEEVLQVNSLIQTQKSAELAAALLS I YMERAED 
L PLRKGTKHLS PYATI»TVGDSSHKTKTI SQTSAPVWDES ASFL I 
RKPHTESLELQVRGEGTGVLGSLSLPLSELLVADQLCLDRWFTL 
SSGQGQVLLRAQLGILVSQHSGVEAHSHSYSHSSSSLSEEPELS 
GGPPHITSSAPEVXRQRI/THVDSPLEAPAGPLGQVKLTLWYYSE 
ER KL VS I VHG CRSLRQNGRDPPDP YVSIiLI>LPDKNRGTKRR TS Q 
KKRTLSPEFNERFEWELPLDEAQRRKLDVSVKSNSSFMSREREL 
LGKVQLDLAETDLSQGVARWYDLMDNKDKGSS 


5428 
5429 


3 


1839 


SSRSERLSACAIAPPWLVSSRPARPAQLQRPGKMVEDGAE£fl7ED 
LVHFSVSELPSRGYGVMEEIRRQGKLCDVTLKIGDHKFSAHRIV 
LAAS I P YFHAMFTNDMMECKQDE I VMQGMDPSALEALINFAYNG 
NLAI DQQtTVQS L LMGAS FLQLQS I KDACCT FkRERI*HPKNCI*GV 
RQFAETMMCAVLYDAANSFIHQHFVEVSMSEEFIALPLEDVLEL 
VSRDELNVKSEEQVFEAA1AWVRYDREQRGTFL\RNLQSNIRLL 

FCR pqfls dr vqqddl vrcchkcrdl vdeakd yllmp errphlp 

AFRTRPRCCTS I AGLI YAVGGLNSAGDSLNWEVFDP IANCWER 
CRPMTTARSRVGVAWNGLLYA1GGYDGQLRLSTVQAYNTETDT 
WTRVGSMNSKRSAMGTWLDGQI YVCGG YDGNSSIiSS VETYS PE 
TDKWTWTSMSSNRSAA\GVTVFEGRIYVSGGHDGLQIFSSVEH 
YNHH TATWHPAAGMLNKRCR HGAAS LGSKMFVCGGYDGSGFLSI 
AEMYS S V\ ADQWCLI VPM\HTRR \ SRVSIXJGPAVGRLYAVWG VT 
TGQSNL\SSVGDVLTPETDCWTFM\APMACHEGGVGVGCIPLLT 


5430


828 


202 


RREDALSSEGCLWPSESTVSGNGIPEPQVYAPPRPTDRLAVPPF 
AQRERFHRFQPTYPYLQHEIDLPPTISIiSDGEEPPPYQGPCTIiQ 
LRDP EQQLELNRE S VRAP PNRTI FDSDLMDSARLGGPC P P SSNS 
GISATCYGSGGRMEGPPP\TYSEVIGHYPGSSFQHQQSSGPPSL 
LEGTRLHHTHIAPLESAAIWSKEKDKQKGHPL 




441 


1507 


QKRRK^RRKKIMKTIQPK>1HNSISWAIFTGLAALC^FQGVPVRS 
GD ATFP KA14DNVTVRQGE SATIjRCT I DNRVTRVAWLNRST I I*YA 
GNDXWCLD PR WLL SNTQTQ YS I EIQNVDVYDEG P YTCS VQTDN 
HPKTSRVHLIVQVSPKIVE1SSDISINEGNNISLTCIATGRPEP 
TVTWRHISPKAVGFVSEDEYLEIQGITREQSGDYECSASNDV\A 
APV\VRRVKVTVNYPPYISEAKGTGVPVGQKGTLQCEASAVPSA 
EFQWYKDDKRLI/EGKKGVKVENRPFLSKL1FFNVSEHDYGNYT 
CVASNKLGHTNAS I MLFG PGAVS E VSNGTS RRAG C VWLLPLLVIj 
HLLLKF 


5431 
5432 




1312 


AAAAPGSRRRRPI,PDRPHMAHGYEAPPPPAPRSPAWRARSKPV\~ 
LPGITINP\TIAEGPSP\TSEGASEANIjVDLOKKLEELELDEQQ 
KKRIiEAF IjTQKAKVGEI*KDDDF ER I SEDGAGNGG WTKVQHR PS 
Ghl MARKLIHIjE IKPAI RNQI I RELQVLiHECNSP YI VGFYGAF Y 
SDGEISICMEWKDGGSLDQVLKEAKRIPEEILGKVSIAVLRGLA 
YLREKHQ I MHRD VKPSNI LVNSRGE I KLCD FG VSGQL I DSMANS 
FVGTRS YMAPERLQGTO YSVQSD I WSKGLSL VEIxAVGRYPIPPP 
DAKELEAIFGRPWDGEEGEPHSISPRPRPPGRPVSGHGMDSRP 
AMAI FELLDYI VNB P P PKIjPNGVFTPDFQEFVNKCLI KNPAERA 
DLXMLTNHTFIKRSEVEEVDFAGWLCKTLRLtfQPGTPTRTAV 


5433


C 


1312 



AAAAPGSRRRRPLPDRPHMAHGYEAPPPPAPRSPAWRARSRPVV~ 
LPGITINPXTIAEGPSPXTSBGASEANLVDLQKKLBELELDEQQ 
KKRLEAFLTQKAKVGELKDDDFERIS ELGAGMGGVVTKVQHRPS 

oLIMAJRKLi THTiRT KPATPNTit TDi?rjTirr uppmo n v ti m mr^ , t,, , 
" "viwjxiiJjuxivrrtiAlNyj, XK&uUV ±jnjL*-NZ*V x IVQFYGAFY 

3DGE IS I CMEHMDGGSLDQVLKE AKRI PEE ILGKVS I AVLRGLA 
YLREKHQ IMHRDVKPSNT LVNSRGE I KL CD FGVSGQLI DSMANS 
FVGTRS YMAPERLQGTH YS VQSD I W S MG L S L VE LAVGRY P I PP P 
3AKELEAI FGRP WDGEEG EPHS I S PRP RPPGRP VS GHGMD S R P 
^MArFELLDYIVNEPPPKLPNGVFTPDFQEFVNKCLIKNPAERA 
3LKMI>TNHTFlXRSEVEEVDFAGWIiCKTLRLNQPGTPTRTAV 




360 


1885


3 VQEDKVG FE D P LHLCS WRARACPCT W PHC/ CTGL*LECLG FAGV 
jFGWPSLVFVFKNEDYFKDLCGPDAGPIGNATGQADCKAQDERF
>LI FTLGS FMNNFMTFP TG YI FDR FKrTVARL IA I FF YTTATL I
I AFTSAGSAVLLFLAMPMLT IGG I LFLITNLQ IGNLFGQHRSTI 
ITLYNGAFDSSSAVFLIIKLLYEKGISIjR/VLLHLHLCLQYLAC 
STHFPPDAPGAHPIPrAPQLQLWPVPWBWHHKGREXS/QQLSMKT 
G S YSQRS S FQRRKR PQGQGRS RNS APSGATL / CSRR FAWHLVWL 
SVIQLWHYLFIGTLNSLLTNMAGGDMARVSTYTNAFAFTQFGVL 
CAPWNGLLMDRLKQKYQKEARKTGSS TLAVALCS TVPS LALTSL 
LCLGFALCAS VPILPLQYL.TF I LQVISRS FL YGSNAAFLTLAFP 
S EHFGKL FG L VMAL S AWSLLQ FP 1 FTL I KG S LQNDP F YVNVMF 
MLAILLTFFHPFLVYRECRTWKESPSAXA 


5434


66 


652 


RYAALIISLIQHKLLWRNQHCSRCVIMSPAQSAGLNWLF/GSGK' 
HGPFLGCSQYPACDYVRPLKSSADGHIVKVLEGQVCPACGANLV 
LRQGRFGMFI GC JNYPECEHTBLIDKPDBTAI TCPQCRTGHhVQ 
RRSRYGKTFHSCDRYPECQFAINFKPIAGECPECHYPLLIEKKT 
AQGVKHFCASKQCGKPVSAE 


5435 


4704 


1597 


1 PGDSSQRLAEMSNAKERKHAKKMRNQPIim*LSSGFVADRGVKH 
HSGGEKPFQAQKQEPHPGTSRQRQTRVNPHSLPDPEVNEQSSSK 
GMFRKKGGWKAGPEGTSQEIPKYITASTFAQARAAEISAMLKAV 
TQKSSNSLVFQTLPRHMRRRAMSHNVKRLPRRLQEIAQKEAEKA 
VHQ KKEH S KNKCHKARRCHMNRTLE FNRRQKKN I WLETH I Wl IAK 
RFHMVKKWGYCLGERPTVKSHRACYRAMTWRCLLQDLSYYCCLB 
LKGKEEE I LKALSGMCN I DTGLTFAAVHCLSGKRQGSIiVL YRVN 
KYPREMLiGP VT F I WKSQRTPGDPS ES RQLW I WLHPTLKQD I LEE 
IKAACQCVEPIKSAVCIADPLPTPSQEKSQTEIiPDEKIGKKRKR 
KDDG ENAKP I KK I IGDGTRDPCL P YS WIS PTTG 1 1 1 SDLTMEMN 
RFRLIGPLSHSILTEAIKAASVHTVGEDTEETPHRWWIETCKKP 
DSVSLHCRQEAIFELLGGITSPAEIPAGTILGLTVGDPRINLPQ 
KKSKALPNPEKCQDNBKVRQLLLEGVPVECTHS F IWMQD I CKSV 
TENKI SDQDLNRMRSELLVPGSQL ILGPHESKI P ILLIQQPGKV 
TGEDRLGWGSGWDVLLPKGWGMAFWI PFI YRGVRVGGLKESAVH 
SQYKRSPNVPGDFPDCPAGMLFAEEQAKNLIjEKYKRRPPAKRPN 
YVKLGTLAP FCCP WEQLTQDWESRVQAYEEPS VAS S PNG KE SD L 
RRSEVPCAPMPKKTHQPSDEVGTSIEHPREAEEVMDAGCQESAG 
PERITDQEASENHVAATGSHLCVLRSRKLLKQLSAWCGPSSEDS 
RGGRRAPGRGQQGLTREACLSILGKFPRAIiVWVSLSLLSKGSPE 
PHTMICVPAKEDFLQLHEDWHYCGPQESKHSDPFRSKIUKQKEK 
KKREKRQKP \GRAS SDG PAGEE P VAG QE AL TLGL W 3 G PLPRVTL 
HCSRTLLGFVTQGDFSMAVGCGEALGFVSLTOLLDMLSSQPAAQ 
RGLVLLRPPASLQYRFARIAIBV 


5436 


1781 


635 


ASDSIPWSEARTTRKLAQRGCQWSLPERMPLWFCGLPYSGKSR 
RAEElJiVATiAAEGRA\nfVVnDAAVLGAEDPAVYGDSAREKALRG 
ALRASVERRLSRHDWILDSLmriKGFRYELY\CLARAARTPLC 
LVYCVRPGGPIAGPQVAGANENPGRNVSVSWRPRAEEDGRAQAA 
GSS VLREliHTADS WKGS AQADVPKELER BESGAAES PAL VTPD 
SEKSAKHGSGAFYSPELLEALTLRFEAPDSRNRWDRPLFTLVGL 
EEPLPLAGIRSALFENRAPPPHQSTQSQPLASGSFLHQLDQVTS 
QVLAGLMEAQKSAVPGDLLTLPGTTEHLRFTRPLTMAELSRLRR 
QFISYTKMHPNNBNLPQLANMFLQYLSQSLH 


5437
5438


739 


1672 


CQEAASEFGGPLHTPAMFLRRLGGWLPRPWGRRKPMRPDPPYPE 
P RR VD S S S ENS GSDWDSAPETMED VGH PKTXDSGALR VSRAASE 
PS KEE PQVEQLGS KRMDSLKWDQ P ISS TQESGRLE AGGAS PKLR 
WDHVDSGGTRRPGVSPEGGL\GVPGPGAPLEKPGRREKLLGWLR 
GEPGAPSRYI/SGPEECWISTNLTLHLLELLASALLALCSRPLR 
AALDTLGLRGPLGLWLHGLLSFLAALHGLHAVLSLLTAHPLHFA 

CLFGLLQALVLAVSLREPNGDEAATDWBSEGLEREGEEQRGDPG 
KGL 




2443 


1152 



rKPRKRRHQPASQRQRPWSSDSTGDLLARGKGRKEENKGSDRVS 
LAPPSLRRPMMCQSEARCX3PBLRAAKWLHFPQLALRRRLGQLSC 
^ S RPALKLRS W PLTVL YYLL PFGALRPLSRVG WRPVS R VALYKS 
/PTRLLSRAWGRLNQVELPHWLRRPVYSLYIWTFGVNMKEAAVE 
3LHHYRNLSEFFRRKLKPQARPVCGLHSVISPSDGRILNFGQVK
"TCEVEQVKGVTYSLES FLGPRMCTEDLPFP PAASCDS FKNQLVT
REGNELYHCVIYLAPGDYHCFHSPTDWTVSHRRHPPaSIiMSVNP
GMARWIKELFCHNERWLTGDWKKG FFSLTAVGAT \NWGS IRI Y 
FDRDLHTNSPRHS KGS YNDFSFVTHTNREGVPMALRGEHLG/QS 
FNLGSTI VLI FEAPKD FNFQLKTGQKIRFGEAIiGSL 


5439


2443 


1152 


TKPRKRRHQPASQRQRPWSSDSTGDLLARGXGRKBENKGSDRVS 
LAPPSLRRPMMCQSEARQGPELRAAKXTLHFPQLALRRRLGQIiSC 
MSRPALKLRSWPLTVLYYLLPFGALRPLSRVGWRPVSRVAIiYKS 
VPTRLI^RAWGRLNQVELPHWliRRPVYSLYIWTFGVNMKEAAVE 
DLHH YRNLSE F FRRKL KPQARPVCG LHS VI S PS DGR I LNFGQV K 
NCEVEQVKGVTYSLESFLGPRMCTEDLPFPPAASCDSFKNQbVT 
REGNELYHCV I YLAPGDYHCFHSPTDWTVSHRRHFPGSLMSVNP 
GMARW I KEliFCHNER WLTGDWKHGF FS LTAVGAT \ NWG S I R I Y 
FDRDLHTNS PRHS KGS YN D F S F VTHTNREG VPMALRGEHLG / QS 
FNLGSTI VLI FEAPKDFNFQLKTGQKIRFGEALGSL 


5440 


693 


253 


E P I P VT PDHRL VTMTH I V \ QTFS P VNS \GQPPNYEMLKEEQE VA 
MLGAPHNPAPPMSTVIHIRSETSVPDHWWSLFNTLFMNTCCLG 
FIAFAYS VKS RDR KMVGDVTGAQAYAS T A KCLN I W AL I LG I FMT 
ILLI I IPVTjWQAQR 


5441 


2 


2054 


CRDGGXNGFM VS PMKPLE I KTQCSGPRMDPKICPADPAFFS FIN 
MSDLWVANIETGEERRLTFCHQGLSNVLDDPKSAGVATFVIQEE 
FDRFTG Y WWCPTAS WEG S E GLKT LR I L YEEVDES E VEV IHVPS P 
ALEERKTDSYRYPRTGSKNPKIALKIiAEFOTDSQGKIVSTQEKE 
LVQPFSSLFPKVEYIARAGWTRDGKYAWAMFLDRPQQWLQLVLL 
PPAIj F T PSTENEEO V RLAJ5AR AVPRNVrtP v mm VP w\rrivn/wTTM\rtr 

DI FYPFPQSEGEDELCFLRANECKTG FCHLYKVTAVLKSQGYBW 
SEPFS PGEGEQSLTNAIWVNE ETKLV YFQGTKDTPLEHHIjYWS 
YBAAGEI VRLTTPGFS HS CSMSQNFDMFVSHYSS VSTPPCVHVY 
KLSGPDDDPLHKQPRFWASMMEAAKIFHFHTRSDVRLYGMIYKP 
HALQPGKKHPTVLFVYGGPQVQLVNNSFKG I KYLRLNTLASLGY 
AWVI EX3RGSCQRGLRFEGALKNQMGQ VEI EDQ VEGLQFVAEKY 
GFIDLSRVAIHGWSYGGFLSLMGblHKPQVFKVAIAGAPVTVWM 
AYDTGYTERYMDVPENNQHGYEAGSVAIjHVEKLPNEPNRLLILH 
GFLDENVHFFHTNFLVSQLIRAGKPYQIjQVALPPVSPQIYPNER 

hs ircpesgehyevtllhflqeyl 


5442 


1 


3474 


cgqrsrrrspdmpeakpaakkapkgkdapkgapkeappkeapae

APKEAPPEDQSPTAEEPTGVFIjKKPDSVSVETGKDAVVVAKVNG 

kelpdkptikwfkgkwlelgsksgarfsfkeshnsasnvytvel 
higkvvlgdrgyyrlevbcakdtcds cgfkidveaprqdasgqsl 
esfkrtsekksdtageldfsgi»lkkrevveeekkkkkkddddlg 
I ppe iwellkgakkse yekiafq yg itdlrgmlkrlkkakve vk 

KS AAFTKKIiDPAYQVDRGNK 1 KLMV3 1 SDPDLTL KWFKNGQEI K 

psskwfenvgkkriltinkctladdaayevavkdekcftelfv 
keppvlivtpledqqvfvgdrvemavbvseegaqvmwmkdgveii 
tredsfkaryrfkkdgkrhilifsdwqbdrgryqvitnggoce 
aeliv^ekqlevt^diadlrvkaseqavtkcevsdekvtgkwyk 

NGVEVRPSKRITISHVGRFHKIiVIDDVRPEDEGDYTFVPDGYAL 
GSLSAKLNFLEIKVEYVPKQ\EPPKI PLGFASGGKTSENAD/ 1 V 
WAGNKLRLDV\SITGEAPSPFAT\WLKG\DEVFTTTEGRTRIE 
KRVDCSS F VI ES AQREDEGRYTI KVTNP IGED VAS 1 FLQ WD VP 
DPPEAVRI TSVGEDWAI LVWEPPMYDGGKP VTGYLVERKKKGSQ 
RWMKLNFEVFTETTYESTKM1EGILYEMRVFAVNAIGVSQPSMN 
TKPFMPIAPTS3PLHLIVEDVTDTTTTLKWRPPNRIGAGGIDGY 
LVEYCLEGSEEWVPANTEPVSRCGFTVKNLPTGARILFRWGVN 
I AGRS E PATLAQ P VT I RE I AEP P K I R LP RHLRQTYI RKVGEQLN 
LWP FQG KPR PQWWTKGGAPLDTS RVHVRTSD FDTVFFVRQAA 
RSDSGEYELSVQIENMKDTATIR3RWEKAGPPINVMVKEVWGT 
NAL VEWQAP KDDGITSE IMG YFVQKAD KKTMEW FNV YERNRHTS C 
TVSDL I VGNE YYFRVY TENI CGLSDS PGVSKNTAR ILKTG ITFK 
P FE YKEHDFRMAP KFLTPI» I DR WVAG YS AALNCAVRGHPKP KV 
VWMKNKME I REDPKFL I TNYQG VLTLNIRRPSP FDAGT YTCRAV 
NBLGEALAECKIiEVRVPQ


5443


66 


1003 


SRGQhDAGQSS EQHGGNRQPEQSRSRSSS S SSS PRRSRSAAE PA 
MALSMPLNGLKEEDKEPLIELFVKAGSDGESIGNCPFSQRLFMI 
LWLKGWFS VTTVDLKRKPADLQNLAPGTHPPFI TFNS EVKTDV 
NKIEEFLEEVLCPPKYLKLSPKHPESNTAGMDIFAKFSAVIKNS 
RPEANEALERGLLKTLQ KLDE Y LNS PLPDEI DENSMED I KFSTR 
KFLDGNEMTLADCNLLPKLHIVKWAKKYRNFDIPKEMTGIWRY 
LTNAYSRDE FTNTC PS DKEVE I \ AYSDVAKRLHQVKSRLLKEVS 
FMSSP 


5444 


2 


344 


SGPIGVTGAQMAKWLRDYLSFGGRRPPPQPPTPDYTESDILRAY 
RAQKNLDFEDPY*DSESRLEPDPAGPGDSKNPGDAKYGSPKHRL 
IKVEAADMARAKALLGGPGEELEADTEYLDPFDAQPHPAPPDDG 
YME P YD AQ WVMS ELPGRG VQL YDTP YEEQDP ETAJDG P P SGQKPR 
QSRMPQEDER PADE YDQPWEWKKDHI S RAFAVQFDS PE WERTPG 
SAKELRRPPPRS PQPAERVDPAL PLE KQPV7FHG PLNRADAESLL 
S LCKEGSYLVRLSETN PQDCSLS LRSSQGFLHLKFARTRENQW 
LGQHSGPFPS VPEpVLHYSSRPLPVQGAEHLALLYP WTQTP* Q 
♦PDWGDRRPNGQVATGLPELWGAEAPSAAAHPGIiHRBRHPEGLP 
RAEKPGLRGPL1.GLRE PLGAGPRGP WGLQE PRRCQVWFSQAPAH 
QGGGCGYGQSQGPSGRPRGGAGSRH 


5445


2364 


486 


ILSRGFLGSVEICIQLPIjPASEPVLLiTWARRRWRETRSRREPT 
TLRAQSVCPWWI*ETRMNRSIPVEVDESEPYPSQLLKPIPEYSP 
EEESEPPAPNIRNMAPNSIiSAPTMLHNSSGDFSQAHSTLKOANH 
QRPVSRQVTCLRTQVLEDS EDS FCRRHPGLGKAFPSGCS AVSE P 
ASES WGALPAEHQFS FMEKRNQWLVSQLSAASPDTGHDSDKS D 
QSLPNASADSLGGSQEMVQRPQPKRNRAGLDLPTIDTGYDSQPQ 
DVLGIRQLERPLPLTSVCYPQDLPRPLRSREFPQFEPQRYPACA 
QMIiPPNTjSPHAPWNYHYHCPGSPDHQVPYGHDYPRAAYQQVIQP 
ALPGQPLPGASVRGLHPVQKVILNYPSPWDQEERPAQRDCSFPG 
LPRHQDQPHHQPPNRAGAPGESLBCPAELRPQVPQPPSPAAVPR 
PPSNPPARGTLKTSNLPEELRKVFITYSMDTAMEWKFVWFLLV 
NGFQTAIDI FEDRIRGIDI IKWMBR YiiRDKTVMI I VAIS PKYKQ 
DVEGAESQLDEDSHGLHTKYIHRMMQIEF1KQGSMNFRFIPVLF 
PNAKKEHVPTWLQNTKVYSWPKNKKNILLRLLREEEYVAPPRGP 
LPTLQWPL 


5446 


972 


161 


SSWSWCTGRMRKTRLWGLLWMLFVSELRAATKLTEEKYELKEGQ 
TLDVKCDYTLEKFASSQKAWQI IRDGEMPKTLACTERPS KNSHP 
VQVGRI ILEDYHDHGLLRVRMVNLQVEDSGLYQCVIYQP PKEPH 
MLFDRIRLWTKGFSGTPGSNENSTQNVYKIPPTTTKALCPLYT 
TPRTVTQAPPKS TADVS TPDS EINLTNVTDI IR VP VFN I VI UjA 
GGFLSKS LVFSVLFAVTLRS FVP+AHEPTRMSSDFQPHPSGSCA 
KGGGRR 


5447 


207 


617 


MTARTLS LMAS LVAYDDSDS EAETEHAGSFNATGQQKDTSGVAR 
PPGOD FASGTLD VPKAGAQPTKHGS CEDPGG YRLPLAQLGRSDR 
GSCPS QR LQW PG KEPQVTF PIKEPSCSS LWTSHVPASHM PLAAA 
RFKQVKLSRNFPKSSFHAQSESETVGKNGSSFQKKKCEDCWPY 
TPRRLRQRQALSTETGKGKD VEPQGP PAGRAPAPL YVG PG VS E F 
I Q P Y LNS HYKETTVPRKVL FHLRGHRGPVNTI QWCP VLS KSHML 
LSTS MD KTF KVWNAVDSGHCLQTYS LHTEAVRAARWAP CGRR I L 
S GGFD FALHLTDLETGTQL FSGRSD FR I TTLKFHPXDHN 1 FLCG 
GFS S EMKAWDI RTGKVMRS YKATIQQTL DILFLREG SE FLS STD 
ASTRDSADRT 1 I AWDFRTS AKI SNQ 3 FHERFTCPS LALHP R E P V 
t J-iAU i jnkjim x UAL, * SXVWP YR MSRRRR YEGHKVEG YS VGCECS PG 
GDLLVTGS ADGRVLM YSFRTAS RACTLQGHTQAC VGTTYH P V LP 
S VLATCS WGGDMXI WH*AFHWLSLGEA IGDLAPARG YSG PGRS L 
KSP S P S KS LLVLLCGRAM FQ PATCP WQL PALS K 


5448 


194 


1833 


MASKVTDA1VWYQKKIGAYDQQIWEKSVEQREIKGLRNKPKKTA 
HVKPDLI DVDLVRGSAFAKAKPESPWTS LTTKGI VRWFFP FFF 
RWWLQVTSKVIFFWLLVLYLLQVAAIVLFCSTSSPHSIPLTEVI 
G PI WLMLLLGTVHCQ I VS TRTP KP PLS TGGKRR RKLRKAAHLE V 
HREGDGSSTTDNTQEGAVQNHGTSTSHSVGTVFRDLMHAAFFLS 
GS KKAKNS I DKS TETDNG Y VS LDGKKT VKSGEDG IQNHEPQCET
IRPEETAWNTGTLRNGPSKDTQRTITNVSDEVSSEEGPETGYSL 
RRHVDRTSEGVLRNRKSHHYKKHYPNEDAPKSGTSCSSRCSSSR 
QDSESARPESETEDVLVJEDLLHCAECHSSCTSETDVENHQINPC 
VKKEYRDDPFHQSHLPWLHSSHPGLEKISAIWrEGNDCKKADMS 
VLE1SGMIMNRVNSH I PGIG YQI FGNAVSL J LGLTPF VFRLS QA 
TDLEQIjTAHSASELYVIAFGSNEDVIVLSMVIISFWRVSLVWI 
FFPLLCVAERT YKQVG IM * TSEGVLRNRKSHHYKKHYPNEDAPK 
SGTS CSSRCSSSRQDSES ARPESETEDVLWEDLLHCAECHSSCT 
SETDVENHQINPCVKKEYRDDPFHQSHbPWLHSSHPGLEKISAI 
VWEGNDCKKADMS VLB I SGM I MNRVNSHI PG IG YQIFGNAVSLI 
LGLTP FVFRLSQATDLEQLTAHSAS EL Y VI AFGSNEDVI VLS MV 
I IS FWR VSLVWI FFFLL C VAERT YKQVGIM 


5449 


194 


1833 


MASKVTDAIVWYQKKIGAYDQQIWEKSVEQREIKGLR^fKPKKTA 
HVKPDL I DVDLVRGSAFAKAXPES P WTS LTTKG I VR WF FPFFF 
RWWLQVTSKVI FFWLLVLYLLQVAAIVLFCSTSS PHSIPLTEVI 
GPIWLMLLLGTVHCQIVSTRTPKPPLSTGGKRRRKLRKAAHLEV 
HREGDGSSTTDNTQEGAVQNHGTSTSHSVGTVFRDLWHAAFFLS 
GSKKAKNS IDKSTETDNG YVS LDGKKTVKSGEDG I QNHEPQCST 
I RPEETAWNTG TLRNGPS KDTQRT I TNVSDEVS S EEGPETGYSL 
RRHVDRTSEGVLRNRKSHHYKKHYPNEDAPKSGTSCSSRCSSSR 
Q OSES ARP ES ETEDVLWEDLLHCAE CHS SCTS BTD VENHQ INPC
VKKEYRDDPFHQSHLPWLHSSHPGLEKISAIVWEGNDCKKADMS 
VLEISGMIMNRVNSHIPGIGYQIFGNAVSLILGLTPFVFRLSQA 
TDLEQLTAHSAS ELYVI AFGSNEDVI VLSMVI I S FWRVSLVWI 
F FFLLC VAERT YKQVG I M * TS EG VLRNRKSHH Y KKH YPNEDA? K 
SGTS CS S RCS S S RQDSES ARPES ETE D VI»WEDLLHCAECHS S CT 
S ETDVENHQIN PCVKKE YRDDPFHQS HLPWLHSS HPGLE K I SAI 
VWEGNDCKKADMSVLE ISGMIMNRVNS H I PGIGYQI FGNAVSL I 
LGLTP FVFRLSQATDLEQLTAHSASBLYVl AFGSNS DVI VI*SMV 
I ISFWRVSLVWIFFFUjCVAERTYKQVGJM 


5450 


" B136 


1242 


GQQFAS FFG* NHPE VT VAMALTDI DLQLQFSMSQPEAUjBLAAG 
PADHLLIiQLYSGHLQVRLVIjGQEELRLQTPAETLIiSDSIPHTVV 
LTVVEG WATLSVDG FLNAS SAVPGAPL E VP YGLF VGGTGTLGL P 
YLRGTSRPLRGCLHAATLNGRSLLRPLTPDVHEGCAEEFSASDD 
VALGFSGPHSLAAFPAWGTQDEGTLEFTIiTTQSRQAPLAFQAGG 
RRGDF I Y VD I FEGHLRAWE KGQGTVLLHNS VP VADGQPHEVS V 
HINAHRLE I S VDQ YPTHTSNRG VLS YLEPRGSLL LGGLDAEAS R 
HLQEHRLG LTPE ATNAS LLG CMEDLS VNGQRRGLRE ALLTRNMA 
AG CRLEE EE YEDDAYGHYB AFS TLAPB AW PAMEL PE PC VPE PGL 
PP VF ANFTQ LLT IS PL WAEGGTAWIjE WRH VQP TLDLMEAELR K 
S Q VLFS VTRGAH YGELE LD I LGA QARKM FTLLD VVNR KARF I HD 
GSEDTSDQLVLEVSVTARVPMPSCLRRGOTYLLPIQVNPVNDP? 
HI IFPHGSLMVI LEHTQKPIiGPEVFQAYDPDSACEGLTFQVLGT 
S SGLP VERRDQPGEPATEFS CRELEAGSL VYVHCGG PAQDL TFR 
VSDGLQAS PPATLKWAI R PAIQ IHRS TGLRLAQGSAMP ILPAN 
LS VETNAVGQDVS VLFRVTGALQ FGELQKHS TGG VEGAE WWATQ 
APHQRDVEQGRVRYLSTDPQHHAYDTVENLALEVQVGQEILSNL 
S FPVTI QRATVWMLRLEPLHTQNTQQETLTTAHLEATLEEAGPS 
PPTFHYEVVQAPRKGNLQLQGTRLSDGQGFTQDDIQAGRVTYGA 
TARASEAVEDTFRFRVTAPP YFS PL YT FP IHIGGDPDAP VLTNV 
LLWPEGGEGVLSADHLFVKSLNSASYLYEVMERPRLGRLAWRG 
TQDKTTMVTS FTNEDLLRGRLVYQHDDSETTEDD I PFVATRQGE 
S S GDWAWE E VRGVFRVAIQP VNDHAPVQT I S R I FHVARGGRRLL 
TTODVAFS DADSGFADAQLVLTRKDLLFGS I VAVDEPTRPI YRF 
TQEDLRKRR VLFVHS GADRGWIQLQ VSDGQHQATALLEVQAS EP 
YLRVANGSSLWPQGGQGTIDTAVLHLDTNLDIRSGDEVHYHVT 
AGPRWGQLVRAGQPATAFSQQDLLDGAVLYSHNGSLSPEDTMAF 
S VEAGPVHTDATLQ VTI ALEGPLAPLKL VRHKKI YVFQGEAAEI 
RRDQLEAAQEAVPPADI VFSVKS PPSAG YLVM VSRGALADE PPS 
LDPVQS FS QEAVDTGRVL YLHS RPEAWSDAFSLDVASGLGAPLE 
5VLVELEVLPAAIPLEAQNFSVPEGGSLTLAPPLLRVSGPYFPT
LLGLSLQVLEPPQHGPLQKBDGPQARTLSAFSWRMVEEQLIRYV 
HDGSETLTDSFVLMANASEMDRQSHPVAFTVTVIjPVNDQPPILT 
TNTGLQMWBGATAP I PAEALRSTDGDSGSEDIiV YTIEQPSNGRV 
VLRGA PGTE VRS FTQAQ LDGGL VL FSHRGTLDGG FPFRLSDGEH 
TSPGHFFRVTAQKQVLIiSl»KGSQTIiTVCPGSVQPLSSQTLRASS 
SAGTDPQLI,LYRWRGPQLGRLFHAQQDSTGEA1jVNFTQAEVYA 

GN I LYEHEM p pe pfweahdtlelqlss ppardvaatlavavsfe 

AACPQRPSHLWKNKGLWVPEGQRARITVAAIiDASNLLASVPSPQ 
RSEHDVLFQVTQFPSRGQLLVSEEPLHAGQPHFLQSQLAAGQLV 
YAHGGGGTQQDGFHFRAHLQGPAGASVAGPQTSEAFAITVRDVN 

erppqpqasvplrltrgsrapisraqlswdpdsapgeieyevo 
raphngflslvggglgpvtrftqadvdsgrlafvangssvagif 
qlsmsdgaspplpmslavdi lpsai evqlraplevpqalgrssl 
sqoqlrwsdreepeaayriilqgpqyghllvggrptsafsqfqi 
dqge wfaftnfs sshdhfr vlalargvnasavvnvtvraliihv 
waggpwpqgatlrldptvldagelanrtgsvprfrllegprhgr 

WRVPRARTEPGGSQIjVEQFTQQDLEDGRIjGIiEVGRPEGRAPGP 
AGDS LTLELWAQ GVP PAVAS LDFATEPYNAARPYS VALL S VPEA 

ARTEAG kpesstptgbpg pmass pepa vajecggfls FLEANMFS V 

IIPMCLVLLLLALILPLLFYLRKRNKTGKHDVQVl»TAKPRNGtiA 

gdtetfrkvepgqaipltavpgqgpppggqpdpellqfcrtpnp 

ALKNGQYWV 


5451


1 


2274 


rdsseqgrtgdtlgrpsacmdalkppclwrnhergkkdrdscgr 

KNSEPGSPHSLEALRDAAPSQGIjNPLLLPTKMLFIFNFLFSPLP 
TPALICII/TFGAAIFLWIiITRPQPVI/PLLDLNNQSVGIEGGARK 
GVSQKNNDI/TS CC FSDAKTM YEVFQRGLAVS DNG PCLG YRKPNQ 
PYRWLSYKQVSDRAEYLGSCIiLHKGYKSSPDQFVGIFAQNRPSW 
1 1 SELAC YTYSMVAVPL YDTLGPEAI VH I VNKAD IAMVI CDTPQ 
KALVL I GNVE KGFTPSLKVI I LMDP FDDDLKQRGE KSGIEILSL 
YDAENLGKEHFRKPVPPSPEDE*SVlCFTSGTaX3DPXGAMITHQN 
I VSNAAAFLKCVEHAYEP TPDDVAI S Yt» PIAHMFER I VQAWYS 
CGARVG FFQGD IRLLADDMKTLKPTLFPAVPRIiLNR I YDKVQNE 
AKTPLKKFbLKIiAVSSKFKELQKGIIRHDSFWDKLIFAKIQDSL 
GGRVRVIVTGAAPMSTSVMTFFRAAMGCQVYEAYGQTECTGGCT 
FTLPGDWTSGHVGVPLACNYVKLEDVADMNYFTVNNEGEVCI KG 
IWFKGYLJCDPEKTQEAIiDSDGWLHTGDIGRWLPNGTLKIIDRK 
KNIFKIAQGEYIAPEKIENIYNRSQPVLQIFVHGESLRSSLVGV 
WPDTD VLPS FAAJCLG VKGS FEELCQNQWREAI LEDLQKIGKE 

SGLKTFEQVKAIFLHPEPFSIENGLLTPTLKAKRGELSKYFRTQ 
IDSLYEHIQD 


5452


1833 


1138 


SRVPSImCLiS LShShS PSREP VAGAPGCGTAGPPAMATLWGGIjLR 
LGSLLSIiSCLAIiS VLLLAQLSDAAKNFEDVRCKC I CPP YKEttSG 
HIYNKNISQKDCDCIiHWEPMPVRGPDVEAYCLRCECKYEERSS 
VTIKVTI 1 1 YLSILGLLIXYMVYLTLVEPILKRRLFGHAQLI QS 
DDDIGDHQPFAJNAHDVLARSRSRANVLNKVEYAQQRWKLQVQEQ 
RKSVFDRHWLS 


5453
5454


111 


1520 


PSIPAAVPQSAPPEPHREETVTATATSQVAQQPPAAAAPGBQAV^ 
AGPAPSTVPSSTSKDRPVSQPSLVGSKEEPPPARSGSGGGSAKE 
PQEERS QQQDD I E EIJ2TKAVGMSNDGR FLKFD I E I GRGSFKT VY 
KGLDTETTVEVAWCELQDRKLTKSERQRFKEEAEMLKGLQHPNI 
VRFYDSWESTVKGKKCIVLVTELMTSGTLKTYLKRFKVMKIKVL 
RS WCRQILKGLQFLHTRTPP I IHRDhKCDNIFITGPTGSVKIGD 
IiGIATLKRAS FAKS VIGTPE FMAPEMYEEKYDES VDVYAPGMCM 
L EMAT S E Y P Y SECQNAAQI YRRVTSGVKPAS FDKVAI PE VKE 1 1 
EGCIRQNKDERYSIKDLZjNHAFFQEETGVRVELAEEDDGEKlAI 
KLWIiR IEDI KKLKGK YKDNE AI E FS FDLERNV PED VAQEMVE S G 
YVCEGDHKTMAKAI KDRVSL I KRKREQRQL * 




111 


1520 


PSIPAAVPQSAPPBPHREETVTATATSQVAQQPPAAAAPGEQAV 
AGPAPSTVPSSTSKDRPVSQPSLVGSKEEPPPARSGSGGGSAKE 
PQEERSQQQDDIEEI.ETKAVGMSNDGRFLKFDIEIGRGSFKTVY 
KGLDTETTVEVAWCEIiQDRKLTKSERQRFKEEAEMLKGLQHPNI
VRFY DS WES TVKGKKC I VI4VTELMTSGTLKTYI1 KR FKVMK I KVL 
RS WC RQI L KGLQ FLHTRT P P 1 1 HRDL KCDN I P I TG PTG S VK1GD 
LGLATLKRAS FAKS VIGTPEFMAPEMYEE KYDES VDVYAFGMCM 
LEMATSEYPYSECQNAAQI YRRVTSGVKPASFDKVAIPEVKEI I 
EG C IRQNKDER YS I KDLIiNHAFFQE ETC VR VELAEEDDGE KI A I 
KLWLRIEDI KKLKGKYKDNEAIE FS FDLERNVPEDVAQEMVESG 
YVCEGDHKTMAKAIKDRVSLIKRKREQRQL*


5455


1359 


377 


LTMVSPATRKSLPKVKAMDFITSTAILPLLFGCLGVFGIjFRLLQ 
WVRGKAYLRNAWVITGATSGLGKECAKVFYAAGAKLVIaCGRNG 
GALEEblRELTASHATKVQTHKPYLVTFDLTDSGAIVAAAAEII, 
QCFGYVDILVNNAGISYRGTIMDTTVDVDKRVMETNYFGPVALT 
KALhPSMIKRRQGHl VA XSS IQGKMS I PFRS A YAAS KHATQAF F 
DCLRAEMEQYEIEVTVISPGYIHTNLSVNAITADGSRYGVMDTT 
TAQGRS P VEVAQDVLAAVGICKKKD VI LADLLPSIAVYLRTIAPG 
LFFSLMASRARXERKSKNS j 


5456 


2 


2332 


CGAGLVAAGAVLVL YPASRAGERTRVP3SPAPS S LPLHS PGACG " I 
TBVDMDPQRS PLLEVKGNIELKRPlil KAPSQLPLSGSRLKRRPD 
QMEDGLEPEKKRTRGLGATTKITTSHPRVP SLTTVPQTQGQTTA 
QK VS KKTGPR CSTAIATGLXNQKP VPAVP VQKS GTSGVP PMAGG 
KKPS KRPAWDLKGQUCDIiNAELKRCRERTQTLDQENQQLQDQLR 
DAQQQVK»IX5TERTTLEGHrAKVOAQAEQGQQELKNLRACVL,EL 
EERLSTQEGLVQELQKKQVELQEERRGLMSQLEEKERRLQTSEA 
ALS S S QAEVAS LRQETVAQAA1»LTER E ERLHGLEMERRRI1HNQL1 
QELKGNIRVFCRVRPVLPGEPTPPPGLLLFPSGPGGPSDPPTRL 
SLSRSDERRGTLSGAPAPPTRHDFSFDRVFPPGSGQDEVFEEIA 
MLVQSAIiDGYPVCXFAYGQTGSGKTFTMEGGPGGDPQLEGLI PR 
ALRHLFSVAQELSGQGWTYS FVASYVE I YNETVRDLLATGTRKG 
QGGECE I RRAGPGSEELTVTNARYVPVSCEKEVDALLHliARQMR 
AVARTAQNERSSRSHSVFQLQISGEHSSRGLQCGAPLSLVDIiAG 1 
SERIjDPGLALGPGERERLRETQAINSSLSTI.GLVIMALSNKESH 
VPYRNS KLTYIjLQNSIXSGSAJCMLMFVNISPIjEENVSESLNSLiRF 

askvepsvlfgtaqsnrkvjktdpdlcvcvcvcvcvcvcvcvcvp 
msm yr vrggrvaggcfig wrapcprai k 


5457 


2 


1540 


DDFVERRRWTRTTCLVRSPPHVPVCGHACSWNGGSLDPLKGTPA 

llrsaerlmrkvkklrldkentgswrs fslnsegaermattgtp J 

TADRGDAAATDDPAARFQVQKHSWDGLRS I IHGSRKYSGLI VNK 
APHD FQF VQKTDE S G PHSHRL Y YLGMP YG SRENS LLYS E I PKKV 
RKEALI/LLSWKQMLDHFQATPHHGVYSREEEIxI/RERKRI^VFGI 
TSYDFHSESGLFLFQASNSIiFHCRDGGKNGFMVSPGPGCVSPMK 
PLEIKTQCSGPRMDPKICPADPAFFSFINNSDLWVANIETGEER [ 
RLTFCHQGDSNVLDDPKSAGVATFVIQEEFDRFTGYWWCPTASW 
EGSEGLKT IjRI L YEE VDES EVE VI HVPS PALEERKTDS YR Y PRT 
GSKKPKIALKlAEFQTDSQGKIVSTQEKELiVQPFSSLFPKVEYI 
ARAGWTRDGKYAWAMFLDRPQQWI^LVLLPPALFI PSTENEEQA 
AShCQS CPQECPAVCG VRGGHQRLDQCS 


5458 


6642 


4022 


FVPGLREPQWEPAQPSATMSAPSEEEEYARLVMEAQPEWLRAEV 1 

KRLSHELAErTREKIQAAEYGLAVLEEKHQLKLQFEEtiBVDYEA 

IRSEMEQLKEAFGQAHTNHKKVAADGESREESLI QESASKEQYY 

VRKVIiELQT ELKQLRNVLTNTQS ENERLAS VAQ E L KE I NQNVE I 

QRGRLRDDIKEYKFREARLLQDYSELEEENISLQKQVSVLRQNQ 

VEFEGLKHEIKRLEEBTEYLNSQLEDAIRLKEISERQL.EEALET 

LKTEREQKNSLRKELSH YMS INDS F YTSHI/HVS LDGI/KFSDDAA 

EPNNDAEALVNGFEHGGLAKLPLDNKTSTPKKEGLAPPSPSIiVS 

DLLSELNISEIQKLKQQLMQMEREKAGLLATLQDTQKQLEHTRG 

SLSEQQEKVTRLTENLiSAJjRR1»QASKERQTAIiDNEKDRDSHEDG 

DYYEVD I NGP E I IiACK YHVAVAEAGELREQLKALRS THEARE AQ 

HAEEKGRYEAEGQALTEKVSLLEKASRQDRELLARLEKELKKVS

D VAGE TQGSLS VAQDELVTF S EELANL YHHVCMCNNE TPNRVMIi 
DYYREGC3GGAGRTSPGGRTSPEARGRRSPILLPKGLLAPEAGRA 
DGGTGDSSPSPGSSLPSPLS0PRRBPMNIYNLIAI IRDQIKHLQ 
AAVDRTTELSRQRlASQELGPAVDKDKEALMEEILKIiKSLliSTK 
REQITTLRTVIjKANKQTAEVAIJ^LKSKYENBKAMVTETMMKLR 
NELKALKEDAATFSSLRAMFATRCDEYITQLDKMQRQLAAAEDJE 
KKTLNSLLRMAIQQKLALTQRLELLELDHEQTRRGRAKAAPKTK 
PATPS VSHTCA GAS DRA EGTGLANQVFCS EKHS IYCD 


5459


316 


1262 


RGGHRLSGMAS NFND I VKQG Y VRI RSRRLG I YQRCWL VF KKAS S " 
KGPKRLEKFSDERAAYFRCYHKVTELNWKNVARLPKSTKKHAI 
GIYFNDDTSKTFACESDLEADEWCKVLQMECVGTRINDISLGEP 
DLLATGVEREQSERFNVYLMPS PNU3C YMGECALQ ITYE YICLW 
DVQNPR VKL I S WPLS ALRR YGRDTTWFTFEAGRMCE TGEGLF J F 
QTRDGEA1YQKVHSAALAIAEQHERLLQSVKNSMLQMKMSERAA 
SI»STM VPL PRS AYWQHI TRQHS TGQLYRLQDVSS PLKLHRTETF 
PAYRSEH 


5460 


45 


2097 


RPGCRAGELSTGSRARERVRNRVSAPCGQDSRRCDPBVLRGRSP 
GLGliAEMPSCGACTCGAAAVRLITSSLASAQRGlSGGRIHMSVI, 
GRLGTFETQUiQRAPLRSFTETPAYFASKDGISKDGSGDGNKKS 
AS EGS S KKSGSGNSGKGGNQLRCPKCGDLCTHVETF VSSTRFVK 
CE KCHH FF WLS EADS KKS 1 1 KE PE S AAE AVKLAFQQ KP PPPP K 
KI YNYLDKYWGQS FAKKVL.S VAVYNHYKR I YNNIPANLRQQAE 
VEKQTSLTPRELEIRRREDEYRFTKIjLQIAGISPHGNAIiGASMQ 
QQWQQ 1 PQEKRGGE VL DS SHDD I KLBKSN I LLIiGPTGSG KTLL 
AQTLAKCLDVP FAI CDCTTbTQAG YVGEDI ESV I AKLLQDANYN 
VEKAQQGIVFLDEVDKIGSVPGIHQLRDVGGEGVQQGIiIiKLLEG 
TI VNV PEKNS RKLRGE TVQVDTTN I L FVASGAFNGLDR IIS RRK 
NEKYLGFGTPSNLGKGRRAAAAADLANRSGESNTHQDIEEKDRIj 
LRHVE ARDL I E FGM I PE FVGRLP WVPLHSLDEKTLVQ I LTE PR 
NAVI PQ YQALFSMDKCELNVTEDALKAXARLAbERKTGARGLRS 
IMEKLIiLEPMFEVPNSDIVCVEVDKEVVEGKKEPGYIRAPTKES 
SEEEYDSGVBEEGWPRQADAANS 


5461


1461 


160 


INPPPPPKSPCGRARKWRRRRRPGAPEAAVMELPSGPGPERLFD 
SHRLPGDCFLLLVLLLYAPVGFCLLVLRLFLG IHVFLVS CALPD 
S VLRRF VVRTMCAVLGX>VARQEDSGLRDHS VR VL I SNHVT P FDH 
NIVNLLTTCSTPLLNSPPSFVCWSRGFMEMNGRGELVESLKRFC 
ASTRLPPTPUiLFPEEEATNGREGLLRFSSWPFSIQDWQPliTL 
QVQRPLVS VTVS DASWVS ELLWS L F VP FTVYQ VRWLRP VHRQLG 
EANEEFALRVQQLVAKEJLGQTGTRLTPADKAEHMKRQRHPRLRP 
QSRQS S FPPS PG, PS PDVQLATLAQR VKE VL PH V PLG VIQRDLAK 
TGCVDLTITNLLEGAVAFMPEDITKGTOSLPTASASKFPSSGPV 
TPQPTALTFAKS £ WARQ ES IjQER KQAL YE YARRRFTERRAQ EAD 


5462 
5463


663 


3353 


KIKERQMSANKS PPSAQKSVLPTAI PAVLPAASPCSSPKTGLSA " 
RLSNGS FSAPS LTNSRGS VHTVS FLLQI GLTRES VT I EAQELSL 
S AVKDLVCSIVYQKFPECGFFGMYDKII#IjFRHDMNSEN1 LQIjI t 
SADE IHEGDLVE WLSALAT VEDFQ IRPHTLYVHS YKAPT FCD Y 
CGEMLWGI,VROGliKCEGCGU^YHKRCAFKIPNWCSGVRlCRRLSN 
VS3L.PGPGLSVPRPLQPEYVALPSEESHVHQEPSKRIPSWSGRPI 
WMEKMVMCRVKVPHTFAVHSYTRPTICQYCKRLLKGLFRQGMQC 
KDCKFNCHKRCASKVPRDCLGEVTFNGEPSSLGTDTDIPMD IDN 
NDINS DS S RGLDDTEEPS PP EDKM FFLDPS DLD VERDEE AVKTI 
SPSTSNWIPUyiRWQSIKHTKRKSSTMVKEGWIWlYTSRDNI,R^ 
RHYWRI»DSKCLTLFQNESGSKYYKEIPIiSEILRlSSPRDFTNIS 
QGSNPHCFEIITDTMVYFVGENNGDSSHNPVLAATGVGLDVAQS 
WEECAIRQALMPVTPQASVCTSPGQGKDHKDLSTS ISVSNCQI QB 

RFPTKQESQLRNEVAII^NLHHPGIVNLECMFETPERVFVVMEK 
LHGDMLEMILSSEKSRIiPERITKFMVTQlLVALRNLHFKNIVHC 
DLKPENVLLASAEPFPQVKLCDFGFARI I GEKSFRRS WGT PAY 
LAPEVLRSKGYNRSLDMWSVGVItYVSliSGTFPFNEDEDINDQI 
QNAAFMYPPNPWREISGEAIDLINNLLQVKMRKRYSVDKSLSHP 
W LQDYQTW1J3LR E FETR IGERY I THE S DDARW E IHAYTHNIA/ Y P 
KHFIMAPKPDPMEBDP 




237 


1012 


OLSVTMTTSRCSHLPEVLPDCTSSAAPVVKTVEDCGSLVNGQPQ"" 
yVMQVSAKDGQUjSTVVRTIATQSPFNDRPMCRICHEGSSQEDL 
LSPCECTGTLGTIHRSCLEHWLSSSNTSYCBLCHPRFAVERKPR 
PLVEWLRNPGPQHEKRTLFGDMVCPLFITPLATISGWLCLRGAV 
DHLHFSSRLEAVGLI ALTVALFT I YLFWTLVS FR YHCRLYNEWR 
RTNQRVI LL I PKS VNVPSNQPS LLGLHS VKRNS KET W 


5464 


195 


677


SPSMNPRKKVDL.KLI IVGAIGVGKTSLLHQYVHKTFyEEyQTTri 
GAS ILSKI 1 1 LGDTTLKUJ1 WDTGGQERVRSMVSTFYKGSDGCI 
LAFDVTDLESFEALD IWRGDVIAKI VPMEQS YPMVLLGNXIDLA 
DRKYQSILENHLTESIKLSPDQSRSRCC 


5465 


5278 


3348 


1 KGDPREFIRVHREALECDYVSAHLHEWIDLIFGYKQQGPAAVBA 
VNVFHHLFYEGQVDIYNINDPLKETATIGFINNFGQIPKQLFKK 
PHPPKRVRSRLMGDNAGISVLPGSTSDKIFFHHLDNLRPSLTPV 
KELKEPVGQIVCTDKGIIiAVEQWKVIil PPTWNKTFAWG YADI/S C 
RLGTYES DKAMTVYECLSEWGQI LCAI CPNPKLVITGGTSTWC 
VWEMGTSKEKAKTVTLKQALLGHTDTVTCATASLAYHI IVSGSR 
PRTCI I WDLNK LS FLTQI*RGHRAP VSAI*C JNELTGDI VSCAGT Y 
IHVWS INGNPI VSVNTFTGRSQQI ICCCMSEMNEWDTQNVI VTG 

j HS DG WR FWRM E FLQVPETP APE PAE VLEMQED CPE AQ I GQEAQ 
DEDS SDSEADEQS I SQDPKDTPSQPSSTSHRPRAAS CRATAAKC 
TDSGSDDSRRWSDQLSLDEKDGFIFVNYSEGQTRAHLQGPLSHP 
HPNPI EVRNYS RLKPG YRWERQIiVFR S KLTMHTAFDRKDNAHF A 
EVTALG IS KDHSRILVGDSRGRVFS WSVSDQPGRSAADHW VKDE 
GGDSCSGCSVRFSLTERRHHCRNCGQLFCQKCSRFQSEIKRLKI 
SSPVRVCQNCYYNLQHERGSEDGPRKTC 


5466 


3 


992 


HACAHASAHASGRL VRWWRKRRS VMG IQTS PVLLASLGVGL VTL 
\ LGLAVGSYLVRRSRRPQVTliLDPNEKYLLRLLDKTTVSHNTKRF 
RFALPTAHHTLGLPVGKHIYLSTRIDGSLVIRPYTPVTSDEDQG 
YVDLVIKVYLKGVHPKFPEGGKMSQYLDSLKVGDWEFRGPSGL 
LTYTGKGHFNIQPNKKSPPEPRVAKKLGMIAGGTGITPMLQLIR 
AI LKVP ED PTQC FIiLFANQTEKD 1 1 LREDLEE LQAR Y PNRFKLW 
FTLDHPPKDWA YS KGFVTADM IREHL ? APGDDVLVIaLCGPPPMV 
QLACHPNLDKLGYSQKMRFTY 


5467 


2103 


4 


GEALRVGTRGCRRDLPDPQARIFIQKKDLEEDESVTAAHLKSRG 
RSPRKI DQFCNSSNMVHGSVTFRDVAI DFSQEEWECLQPDQRTL 
YRDVMLBN YSHL I S1AGS3 1 SKPDVT TLLEQEKEPWMWRKETS 
RRYPDLELKYGPEKVS PENDTSEVNIjPKQVIKQ I STTLGIEAFY 
FRNDSEYRQFEGLQGYQEGNINQKMISYEKLPTHTPHASLICNT 
HKP YECKECGKYFSCGSNL IQHQS IHTGEKPYKCKSCGKAFQI/H 
rQLTRHQKFHTGEKTFECKECGKAFNLPTQLNRHKNIHTVKKLF 
ECKECGKS FNRS SNLTQHQS I HAG VKP YQCKE CGKAFNRGSNLI 
QHQKIHSNEKPFVCKECGMAFRYHYQLIEHCQIHTGEKPFECKE 
CGKAFTLLTKLVRHQKIHTGEKPFECRECGKAFSIaLNQLNRHKN 
IHTGEKPFECKECGKSFNRSSNLVQHQSIHAGIKPYECKECGKG 
FNRGAHLIQHQKIHSNEKPFVCRECEMAFRYHCQLIEHSRIHTO 
DKPFECQDCGKAFNRGSSLVQHQSIHTGEKPYECKECGKAFRLY 
LQLSQHQKTHTGEKPFECKECGKFFRRGSNLNQHRSIHTGKKPF 
ECKECX5KAFRXHMHLIRHQKLHTGEKPFECKECGKAFRLHMQLI 
RHQKLHTGEKPFECKECGKVFSLPTQLNRHKNIHTGEKAS 


5468


225


2976


S FLTDL FQS LAQLENLCKQLY ETTDTTTRLQAE KALVEFTNS PD _ "' 

CLSKCQLLLERGSSSYSQLLAATCLTKLVSRTNNPLPLEQRIDI ' 

RNYVLN YLATR PKLATFVTQAIiIQL YAR ITKLGWFDCQKDDYVF 

RNA ITDVTRFLQDS VE YC I IG VT I LS QI*TN BINQ VSATAFL I EA 

DTTHPIiTJOTOKT ICCPonccT xst^tott onm r isr\T> h^iatt » r 
*** u a iuinivi/io o r KUdaiic 1J1 r i JjoL^JjLiIVUAooKNIjNDND 

ESQHGLLMQLLKLTHNCLNFDFIGTSTDESSDDLCTVQIPTSWR
SAFLDSSTLQLSTIGRCEYEKTCALIIVQLFDQSAQSYQELLQSA
SASPMDIAVQEGRLTWLVYIIGAVIGGRVSFASTDEQDAMDGEL

VCR VLQLMNLTDS RIiAQAGNEKLELAML S FFEQFRK I YIGDQ VQ 
KSSWLYRRLSEVIiGLNDETMVLSVFIGKIITNIjKYWGRCEPITS 
KTLQLLNDLS IG YSS VRKLVKLSAVQFMLNNHTSBHFS FLGINN 
3SN^TDMRCRTTFYTALGRLLMVDLGEDEDQYEQFMLPLTAAFE 

WAQMFSTNSFNEQEAKRTLVGLVRDLRGIAFAFKAKTSPMMLF

3WI YPSYM P ILQRAIEliWYHDPACTTPVLKLMAELVHNRSQRLQ 
1 FDVSSPNGIIiLFRETSKMITMYGNRILTLGEVPKDQVYALKLKG 
IS I CFSMLKAALSGS Y VNPGVFRLYGDDALDNALQTPI KLLIiS I 
PHSDLLDYPKLSQSYYSLLEVLTQDHMNFIASIiEPHVIMYIIiSS 
IS EGbTALDTM VCTGCCS CLDH I VT YIjFKQIiSRS TKKRTTPLNQ 
ES DRFLiH I MQQHPEM I QQMLS TVLN III FEDCRNQWS M S R P t*LG 
LI LLNEKYFSDLRNS IVWSQPPEKQQAMHLCFENLMEG IERNLL 

J TKNRDRFTQNLSAFRREVNDSMKNSTYGVNSNDMMS 


5469 


134


2653 


1 DQEFETSLVPWHLPMGWLCSGLLFPVSCLVLLQVASSGNMKVLQ 
EPTCVSDYMSISTCEWKMNGPTNCSTELRLLYQLVFLLSEAHTC 
VPENNGGAGCVCHLtiMDDVVSADNYTLDLWAGQQLIjWKGS FKPS 
EHVKPRAPGWLTVHTNVSDTLLiTWS.VPYPPDNYLYNHLTYAVN 
I WS ENDPADFR I YNVTYLE PSLR I AASTLKSG IS YRARVR AWAQ 
CYNTTWSEWSPSTKWHNSYREPFEQHLLLGVSVSCIVILAVCLL 
CYVS I TKIKKE WWDQI PNPARSRLVAI I IQDAQGSQWEKRS RGQ 
E PAKC PHWKNCLTKLLP CFLEHNMKRDEDPHKAAKEM PFQGS GK 
SAWCPVEISKTVLWPESISWRCVELPEAPVECE3EEEVEEEKG 
S FCAS P ESS RDDFQEGREG I VARLTE S IiFLDLLG BENGG FCQQD 
MGE S CLLPPS GS TS AHMPWDE F PS AG PKEAP PWG KEQP LHLE PS 
PPAS PTQSPDNLTCTBTPLVIAGNPAYRSFSNSLSQSPCPREU3 
PDPLLARHliEEVEPEMPCVPQIiSEPTTVPQPEPETWEQI LRRW 
LQHGAAAAP VSA PTSGYQE FVHAVEQGGTQASA WGLG P PGEAG 
YKAFS SliLAS S AVSP E KCG FGAS SG EEGYKP FQDL I PG C PGDP A 
PVPVPLFTFGLDREPPRSPQSSHLPSSSPEHLGLEPGEKVEDMP 
KPPLPQEO^TX)PI,VDSIX3SGIVYSALTCHLCX3ITriKQCHGQEDGG 
QTPVMASPCCGCCCX3DRASPPTTPLRAPDPSPGGVPLEASLCPA 
SLAPSGISEKSKSSSSFHPAPGNAQSSSQTPKIVNFVSVGPTYM 
RVS 


5470 


17 


1418 


TACRIRTSUJRG IAAVKKDAVEMLAS YGLAYSLMKFFTGPMSDF 
KNVGLVFVNSKRDRTKAVLCMWAGAIAAVFHTL i aysdlgy yi 
INKLHHVDESVGSKTRRAFLYLAAFPFMDAMAWTHAGILLKHKY 
S FLVGCAS ISD VIAQ WFVAILLHS HLECRE PI»I*I P I LiSLYMGA 
LVRCTTLCLGYYKNIHDIIPDRSGPELGGDATIRKMLSFWWPLA 
LIIaATQRISRPIVNLFVSRDLGGSSAATEAVAlLTATYPVGHMP 
YGWLTE IRAVYPAFDKNNPSNKLVSTSNTVTAAHI KKFTFVCMA 
LSLTLC FVMFKTPNVSEK I L I DI IGVDFAFAELCWPLRI FS FF 
P VPVTVRAHLTGWLMTLKKTFVLAPS SVLRI IVL IASLWLPYL 
G VHGATLGVGS LLAGFVGESTMDAI AACYVYRKQ KKKMENE S AT 
EGEDSAMTDMPPTEEVTDI VEMREENE 


5471
5472


1868


658 


RSS AP PG PQRAAAATAAAAAAG VEMAAAAAQGGGGGE PRRTEG V ' 
GPG VPGE VEMVKGQ PFDVG PR YTQLQ Y I GEGA YGMVS SAYDHVR 
KTRVAI KKI S PFEHQTYCQ RTIiRE IQ I LLRFRHENVI G I RD I LR 
AS 1XBAMRDVY I VQDLMETDIj YKLLKS QQLSNDH I C YFIi YQ I LR 
GLKYIHSANVLHRDDKPSNLLINTTCDLKICDFGLARIADPEHD 
HTGFLTEYVATRWYRAPEIMIiNSKGYTKSlDIWSVGCILAEMLS 
NRPIFPG KH YLDQLNHI LG ILGSPSQEDLNCT INMKARNYLQS L 
PS KTKYAWAKLFPKSDS KALDLLDRMLTFNPNKR I T VEEALAH P 

YLEQYYDPTDEPVAEEPFTFAMELDDLPKERLKELIFQETARFQ 
PGVLEAP 




1469 


753 


Jj YVMAR YliSDEE VA VS I DRXCKANGRS PS IPFGTVRIPGRArW - 
DPQALW I FG YGS L VWR PDFA YSDSRVG FVRGYSRR FWQGDT FHR 
GSDKMPGRWTLLEDHEGCTWGVAYQVQGEQVSKALKYLNVREA 
v x u x ivci v xr i^yiJAPDQPIjKAIiAYVATPQNPGYLGPAPEEA 
IATQILACRGFSGHNLEYLLRVRDVMQLCGPQAQDEHLAAIVDA 
VGTMLPCFCPTEQALALV 


5473


3 


2119 


FMNVKIiLIQDIiEDIEQRVPVMJDAQYKIITKTAHLITKESPQEBG^ 
KEMFATMSKIiKEQLTKVKEC YS PLL YESQQLL I PLEELEKQMTS 
FYDSLGKINEIITVLEREAQSSALFKQKHQELLACQENCKKTLT 
LIEKGSQSVQKFVTLSNVLKHFDQTRLQRQIADIHVAFQSMVKK 
TODWKKHVErNSRLMKKFEESRAELEKVLRIAQEGIiEEKGDPEE 
LLRRHTE FFSQLD QRVLNAFIiKACDELTD I LPEQEQQGLQEAVR 
KLHKQWKDLQGEAP YHLIiHLKI DVEKNRFLASAEECRTELDRET 
KX,MPQEGSEKirKEHRVF7SDKGPHHLCEKRLQLIEELCVKIiPV 
KDPVRDTPGTCHVTLKELRAAIDSTYRKLMEDPDKWKDYTSRFS 
EFSSWISTNETQIiKGIKGEAIDTANHGEVKRAVEEIRNGVrKRG 
ETLS WLKS RL KVLTE VS S ENEAQKQGDELAKLS s s fkal vtlls 
EVEKMLSNFGDCVQYKEIVKNSLEELISGSKEVQEQABKILDTE 
NL FEAQQLLLHHQQKTKR I SAKKRDVQQQ I AQAQQGEGGLPDRG 
HEBI.RKLESTLDGLERSRERQERRIQVTLRKWERFETNKETVVR 
YLFQTGSSHERFLSFSSLESLSSELEQTKEFSKRTESIAVQAEN 
LVKEAS B I PIiGPQNKQL LQQQAKS I KEQ VKKLEDTLE BE YVI DK 
S 


5474 


2 


780


TPDVRQLQASRRGIAVASWCSPRWFAGEEMAFVKS3WLLRQSTI 
LKRWKKNWFDLWSDGHHYYDDQTRQNIEDKVHMPMDCINIRTG 
QECRDTQPPDGKSKDCMLQIVCRDGKTISLCAESTDDCLAWKFT 
LQDSRTNTAYVGSAVMTDETSWSSPPPYTAYAAPAPEVGRTLS 
LQQAYGYGPYGGAYPPGTQWYAANGQAYAVPYQYPYAGLYGQQ 
PANQVI IRERYRDNDSDLALGMLAGAATGMALGSLFWVF 


5475 


2 


506 


ARGWLESLSLTCQTTPPPSSPCLIiHSPETFIHTMPPNLTGYYRF 
VSQ KNMEDYLQALNI SIAVRKIALLLKPDKE I EHQGNHMT VRTL 
5TFRWYTVQFDVGVEFEEDI»RSVDGRKCQTIVTWEEEHI»VCVQK 
GEVPNRGWRHWLEGEMLYLELTARDAVCEQVFRKVR 


5476


192 


1457 


SDSMSL1»DCFCTSRTQVESLRPEKQSETSIHQYLVDEPTLSWS'R ' 

PS TRAS E VLCSTNVSHYE LQVEIGRG FDNI/TS VHLARHT P TGTL 

VTIKlTNLENCWEERLKAJbQKAVILSHFFRHPNITTYWTVFTVG 

SWLWVISPFMAYGSASQLLRTYFPEGMSETLIRNI^FGAVRGLN 

YLHQNGCIHRSIKASHILISGDGLVTLSGLSHLHSIiVKHGGRHR 

AVYDFPQFSTSVQPWLSPELLRQDLHGYNVKSDIYSVGITACEL 

ASGQVPFQDMHRTQMLLQKLKGPPYSPLDISIFPQSESRMKNSQ 

SGVDSGIGESVLVSSGTHTVNSDRLHTPSSKTFSPAFFSLVQLC 

LQQDPEKRPSASS1.LSHVFFKQMKEESQDSILSLLPPAYNKPSI 

SLPPVLPWTEPECDFPDEKDSYWEF 


5477 


3 


1044 


RGNSRliRYSHEDEIiQLPRLPELFETGRQIiLDEVEVATEPAGSRI 
VOEKVFKGLDLLEKAAEMLSQLDLFSRWBDLEEIASTDLKYLLV 
PAFQGALTM KQVN PS KRLDHLQRAREH F I N YJJTQ CHC YHVAE F 3 
LPKTMNNSAENHTANSSMAYPSLVAMASQRQAKIQRYKQKKELE 
HRLS AM KS AVESGQADDER VRE YYLLHLQR W IDI S L EE I E£ IDQ 
ElKIIiRERDSSREASTSNSSRQERPPVKPFILTRNMAQAKVFGA 
GYPSLPTMTVSDWYEQHRKYGALPDQGIAKAAPEEFRKAAQQQE 
EQEEKEEEDDEQTLtlRAREWDDWKDTHPRGYGNRQNMG 


5478 


2 


835 


KWRIWVPNVKGESTVFRAHTATVRSVHFCSDGQSFVTASDDKT 
VXVWATHRQKFI* FSLSQHINWVR CAKFSPDGRZjI VSASDDKTVK 
IiWDKSSRECVHSYCEHGGFVTYVDFHPSGTCIAAAGMDNTVKVW 
DVRTHRLLQH YQLHSAAVNGLS FHPSGN YL I TASSDSTLKILDL 
M EGRLL YTLHGHQG PATTVAFS RTGEY FASGGSDBQVMVWKSNF 

DIGDHGBVTKVPRPPATLASSMGNLTVSILEQRLTIiEEDKLKQC 
LENQQLIMQRATP 


5479 
5480


2 


835 


KT VRI WVPNVKGEST VFRAHTATVRS VKFCS DGQS F VTASDDKT 
VKVWATHRQKFLFSLSQHINWVRCAKFSPDGRLrVSASDDKTVK 
LWPKSSRECVHSYCEHGGFVTYVDFHPSGTCIAAAGMDNTVKVV7 
DVRTHRLLQH YQLHS AAVNGLS FHPSGNYI»I TASSDSTUCILDL 
MEGRLLYTLHGHQGPATTVAFSRTGEYFASGGSDEQVMVWKSNF 
i/Avjunoii v i tvv f t*f vAi LiMatjMCjfWLTVS I LEQRIjTLEEDKLKQC 
LENQQL I MQRATP 




444 


1952 



LSLTSRMEEAiSDVKGRIjQAlTDKRKIQEBISQKRLKIEEDKLKH 
QHLKKKALREKWIjLDGISSGKEQEEMKKQNQQDQHQIQVLEQSI 
LRLE^IQDLEKAEI^ISTKEEAILKKLKSIERTTEDIIRSVKV 

ereeraeesiediyanipdlpksyipsrlrkbineekeddeqnr 

KALYAME I KVEKDLKTGESTVLSS I PLPSDDFKGTX3I KVYDDGQ 
KSVYAVSSNHSAAYNGTDGIiAPVEVEELLRQASERNSKSPTEYH 
3PVYANPFYRPTTPQRETVTPGPNFQERIKIKTOGLGIGVNESI 
-INMGNGLSEERGNNFNHI SP1 PPVPHPRSVIQQAEEKLHTPQKR 
LMTPWEESNVMQDKDAPSPKPRLSPRETIFGKSEHQNSSPTCQE 
DEE DVR YNT VHS LP PD I NDTE P VTMI FMG YQQAED S E EDKK FLT 

GYDGIIHAELWIDDEEEEDEGEAEKPSYHPIAPHSQVYQPAKP 
TPIiPRKRSEASPHEKHKS 


5481 


3 


1422 


NSPGSVCLCQCVCPSLLHCLPPLLLIiLLLPLLLHES PQPPAijRV ' 
VATSSDRNFMNKHQKPVLTGQRFKTRKRDEKEKFEPTVFRDTLV 
QG LNEAGDDLEAVAKFLD STGSR LD YRR YADTLFD I L VAGSM LA 
PGGTRIDDGDKTKMTNHCVFSANEDHETIRNYAQVFNKLIRRYK 
YLE KAFEDEM KKLLLFLKAF S ET EQTKLAMLS GILLGNGTLPAT 
1 LTS LFTDS LVKEG I AAS FAVKLFKAWMAE KDANSVTSSLRKAN 
LDKRLLELFPVNRQSVDHFAKYFTDAGLKEIiSDFLRVQQSLGTR 
KELQKE LQER LSQECPI KEWLYVKEEMKRNDLPETAVI GLLWT 
C I MNAVEWNKKEELVAEQALKH LKQYAP LLAVFSS QGQS ELI LL 
QKVQEYCYDNIHFMKAFQKIWLFYKADVLSEEAILKWYKEAHV 
AKGKSVFLDQMKKFVEWLQNAEEESESEGEEN 


5482 


1492 


528 


THVVMTGMCYAPHQVLSYINGVTTSKPGVSLVYSMPSRNLSLRL 
EGLQEKDSGP YSCS VNVQDKQGKSRGHS IKTLELNVLVPPAPPS 
CRLQGVPHVGANVTLS CQSPRSKPAVQYQWDRQLPS FQTFFAPA 
LD VI RGS LSLTNLS S S MAGVYVCKAHNEVGTAQCNVTLEVSTG P 
GAAVVAGAVVGTLVGLGLLAGL VLL YHRRG KALE EP AND IKEDA 
I A PRTLP WP KS SDTI S KNGTLS S VTS ARALR P PHGP PR PGALT P 

TPSLSSQALPSPRLPTTDGAHPQPISPIPGGVSSSGLSRMGAVP 
VMVPAQSQAGSLV 


5483


1 


788 


FFFFKGCRAGRGNESDYRKLEEMHQRFLVSER5KDDLQLRLTRA 
ENRI KQLBTDS SEE I SR YQEM IQKLQNVLBS ERENCGLVSEQRL 
KLQQENKQLRK3TESLRKIALEAQKKAKVKI STMEHE FS I KERG 
FEVQLR EME DSNRNS I VELRHLLATQQKAANRWKE ETKKLTES A 
FIRINNLKSELSRQKLHTQELLSQLEMANEKVAENEKLILEHQB 
KANRLQRRLSQAEERAASASQQLSVITVQRRKAASLMNLENI 


5484 


3 


1997 


IMADMEDLFGSDADSEAERXDSDSGSDSDSDQENAASGSNASGS ~ 

ESDQDERGDSGQPSNKELFGDDSEDEGASHHSGSDNHSERSDNR 

SEASERSDHEDNDPSDVDQHSGSEAPNDDEDEGHRSDGGSHHSE 

AEGSEKAHSDDEKWGREDKSDQSDDEKIQNSDDEERAQGSDEDK 

LQN S DDDE KMQNTDDEE RPQLSDDERQQ LS E E E KAN S DDE R P VA 

SDNDDEKQNSDDEEQPQLSDEEKMQNSDDERPQASDEEHRHSDD 

EEEQDHKSESARGSDSEDEVLRMKRKNAIASDSEAOSDTEVPKD 

NSGTMDLFGGADDISSGSDGEDKPPTPGQPVDENGLPQDQQEEE 

PIP ETR I EVE I P KVNTDLGNDLYFVKL PNFLS VE PRPFD PQY YE 

DEFEDEEMLDEEGRTRLKJLKVENTIRWRIRRDEEGNEIKESNAR 

IVKWSDGSMSLHLGNEVFDVYKAPLQGDHNHLFIRQGTGLQGQA 

VFKTKLTFRPHSTDSATHRKMlTiSLADRCSKTQKIRILPMAGRD 

PECQRTEMIKKEEERLRASIRRESQQRRMREKQHQRGLSASYLE 

PDRYDEEEEGBESISIiAAIKNRYKGGIREERARIYSSDSDEGSE 

EDKAQRLjUCAKKLTSDEVRPKLFNSRGLSCTQEPTALNEELTDQ 
AGTN 


5485 


161 


1074 


KRKILSSMMDSEAHEKRPPILTSSKQDISPHITNVGEMKHYLCG 
CCAAFWNVAI TFP IQKVLFRQQL YG I KTRDAILQLRRDG FRNL Y 
RGILPPLMQKTTTLALMFGLYEDLSCLLHKHVSAPEFATSGVAA 
VLAGTTEAI FTPLERVQTLLQDHKHHDKFTNTYQAFKALKCHGI 
GE Y YRGL VP I LFRNGLSNVLFFGLRGP I KEHLPTATTHSAHLVN 
DFICGGLLGAMIiGFL FFp I NVVKTRI QSQ IGGEPQS FPKVFQKI 
WLERDRKL INL FRGAHLN YHR SL I S WG 1 1 NAT YE FLLKV I 


5486 


1404 


142 


I PGST IS WSPAAARGLSVCRCCRLHPASAMDLFGDLPEPERSPR 
PAAGKEAQKGPLLFDDLPPASSTDSGSGGPLLFDDLPPASSGDS 
GSLATSISQMVKTEGKGAKRKTSEEEKNGSEELVEKKVGKASSV 
I FGLKGY VAERKGEREBMQDAHVI LNDITEECRPPSSLITRVS Y 
FAVFDGHGGI RASKFAAQNLHQNL I R KFPKGDVI S VE KTVKRCL 
LDTFKHTDEEFLKQASSQKPAWKDGSTATCVLAVDNILYIANLG 
DSRAI LCRYNEBSQKHAALS LS KEHNPTQYEERMRIQKAGGNVR 
DGRVLGVLEVSRS IGDGQYKRCGVT5 VPD I RRCQLTPNDRFI LL 
ACDGIiFKV FTPEEA VNF I LS CLE D EKI QTREGKS AADAR YEAAC 
NRLANKAVQRGSADNVTVMWRIGH 


5487 


535 


182 


AVSLSQIRGLQTPAPVPIjPLQPCPSNCDMERVTLALLLIAGLTA 
LEANDPFANJCDDPFYyDWKNLQLSGLICGGLLAIAGIAAVIiSGK 
CKCKSSQKQHSPVPEKAI PLITPGSATTC 


5488


1072 


259 


AMAASGB PQRQWQEE VAAWWGSCMTDLVS LTSRLP KTGETIH 
GHKFFIGFGGKGANQCVOAARLGAMTSMVCKVGKDSFGNDYIBN 
LKQNDI STE FTYQTKDAATGTAS 1 1 VNNEGQNI IVIVAGANLLL 
NTEDLRAAANVI SRAKVMVCQLE ITPATSLEALTMARRSGVKTL 
FNPAPAIADLDPQFYTLSDVFCCNESEAEILTGLTVGSAADAGE 
AALVLLKRGCQ Wl 1TLGASGCWLSQTEPEPKHI PTEKVKAVD 
TTVSFKI 


5489 


81 


893 


GKGPVAAFIDQSNIFLTDFXIFIiGQWREEPKMPUiLLGETEPLK 
LERDCRSPVEPWAAASPDLAtACLCHCQDLSSGAFPNRGVLGGV 
LFPTVEMVI KVFVATSSGS I AIRKKQQE WGFLEANKIDFKELD 
IAGDEDNRRWMRENVPGEKKPQNGI PLPPQI FNEEQYCGDFDSF 
FSAKEENI I YS FLGIAPPPDSKGSEKA3EGGETEAQKEGSEDVG 
NLPEAQEKNEE EGETATEETSE I AMEGAEGEAEEEEBTAEGESP 
GEDEDS 


5490 


81 


893 


GKGPVAAFIDQSNI FLTDPKI FLiGQ W R E E P KM P IaL LLG ETE P LK""~ 
LERDCR S PVE P V7AAAS PDIiAIjACLCH CQDLS SGAFPNRGVLGGV 
LFPTVEMVI KVFVATSSGS IAIRKKQQE WGFLEANKXDFKELD 
IAGDEDNRRWMRENVPGEKKPQNGI PbPPQI FNEEQYCGDFDSF 
FSAKEENI I YS FLGLAPPPDS KGSEKAEEGGETEAQKEGSEDVG 
NLPEAQEKNEBEGETATEETEEIAMEGAEGEAEEEEETAEGEEP 
GEDEDS 


5491 


204 


1194 


GSAPRLSLGPTGAQARDPDWWARPPSRPYTQSKEDRPDTEGRSE "" 
QGDMAS S FLiP AGA I TGDSGG E LS SGDDS G E VE FPHS PE IE ETE C 
LAELFEKAAAHLQGDIQVASREQLLYLYARYKQVKVGNCNTPKP 
SFFDFEGKQKWEAWKADGDSS PSQAMQE YIAWKKLDPGWNPQ I 
PEKKGKEANTGFGGPVISSLYHEETIREEDKNIFDYCRENNIDH 
ITKAIKS KNVDVNVKDEEGRALLHWACDRGHKEIiVTVUjQHRAD 
INCQDNEGQTALH YASACE FLDI VELLLQS GADPTLRDQDG CIiP 
EEVTGCKTVSLVLQRHTTGKA 


5492 


3 


1896 


ASKNPLSAVCTTGIMSSIoAVRDPAMDRSLRSVFVGNtPYEATEE 
QLKDIFSEVGSWSFRLVYDRETGKPKGYGFCEYQDQETALSAM 
RNLNGRE FSG RALRVDNAAS E KN K E ELKS LGPAAP 1 1 DS P YGDP 
IDPEDAPES ITRAVASLPPEQMFELMKQMKIiCVQNSHQEARNML 
LQNPQLAYALLQAQ WMRIMDPE IALKILHRKIHVTPIiI PGKSQ 
SVSVSGPGPGPGPGLCPGPNVLLNQQNPPAPQPQEIARRPVKDI 
PPLMQTP IQGGI PAPGP I PAAV PGAGPGSLTPGGANQ PQLGMPG 
VGP VPLERGQVQMSDPRAPI PRGP VTPGGL PPRGLMDAPNDPR 
GGTLLSVTGEVEPRGYLGPPHQGPPMHHASGHDTRGPSSHEMRG 
GPLGDPRLL IGEPRG PM I DQRGL PMDGRGGRDSRAMETRAMETE 
VLETRVMERRGMETCAMETRGMEARGMDARGLEMRGPVPSSRGP 
MTGGIQGPGPINIGAGGPPQGPRQVPGI SG VGNPGAGMQGTG I Q 
GTGMQGAGIQGGGMQGAG IQGVSIQGGG IQGGGIQGASKQGGSQ 
P SS FS PGQSQVTPQDQEKAAIi IMQ VLQLTADQ I AML P PEQRQS I 
LILKEQIQKSTGAS 


5493 


1 


1876 


RAPMMT KAV PEE PRKPGRLTQALNS P LTWEHVW I CVPGGT PDCL 
TDTFR VKRPHL RRSASNGHVPGTPVYRE KEDM YDEI IELKKSLH 
VQKSDVDLMRTXLRRLEEENSRKDRQIEQIjLDPSRGTDFVRTLA 
E KRPDAS WVI NGLKQR I LKIiEQQCKE KDGTI S KLQTDMKTTNLE 
EMRIAMETYYEEVHRLQTLLASSETTGKKPLGEKKTOAKRQKKM 
GSALLSLSRSVQELTEENQSLKEDLDRVLSTSPTI5KTQGYVEW 
SKPRLLRRIVELEKKLSVMESSKSHAAEPVRSHPPACLASSSAL 
HRQPRGDRNKDHERLRGAVRDLKEERTALQEQLLQRDLEVKQLL 
QAKADLEKEI/ECAREGEEERREREBVLREBIQTI.TSKLQELQEM 
KKEEKEDCPEVPHKAQELPAPTPSSRHCEQDWPPDSSEEGLPRP 
RS P CS DGR RDAAARVLQAQ WKVYKH KKKKAVLDEAAVVLQAA FR 






WO 01/53312 



PCT/US00/34263



GHhTRTKLlAS KAHGS EPPS VPGIiPDQS S P VPR VPS P I AQATGS "1 

P VQ E EA I V 1 1 QS ALRAHLARARHS ATGKRTTTAAS T RRR S ASAT 

HGDASSPPFIAALPDPSPSGPQAVAPLPGDDVNSDDSDDIVIAP 
SLPTKtfFPV 


5494 


71 


536 


RSKAKIGTPTREVPSTDMKVRRESSSSLTHRPAPSPATPRLLGT"" 
RRVLLGVS EGTGCADAMELVLVFLCS LLAPMVLASAAE KE KEKD 
PFHYDYQTLRIGGLVFAWLFSVGIIiLIIjSRRCKCSPNQKPRAP 
GDEEAQVEN1»I TANATEPQKAEN 5 


5495 


273 


2168 


DSLLLIQVDTMPFTLHLRSRLPSAIRSLILQKKPNIRNTSSMAG 

ELRPASLWLPRSLAPAFERFCQVNTGPLPLIiGQSEPEKWMLPP 

QGA I S ETRMGHPQFWK YEFGACTGSItASI>EQ YS EQLKDM VAFFL 

GCSFSLEEALEKAGLPRRDPAGHSQAGAYKTTVPCVTHAGFCCP 

LVVTMRPIPKDKLEGLVRACCSIiGGEQGQPVHMGDPELLGIKEL 

S KPAYGDAMVCPPGE VP VFWPS PLTSLGA VSS CETPLAFAS I PG j 

CT VMTDL KDAKAP PGCLTPE RI PEVHHI S QD PLH YS IAS VS ASQ 

KIRELESMIGIDPGNRGIGHLLCKDELLKASLSLSHARSVLITT 

GFPTHFNHEPPEETDGPPGAVALVAFLOAIiEKEVAIIVDQRAWN 

LHQK I VEDAVEQG VLKTQ 1 P I LTYQGG S VEAAQAFLCKNGD PQT 

PRFDHLVAIERAGRAADGNYYNARKMNIKHIiVDPIDDLFLAAKK 

IPGISSTGVGDGGNELGMGKVKEAVRRHIRHGDVIACDVEADFA 

VIAGVSNWGGYALACALYILYSCAVHSQYLRKAVGPSRAPGDQA 

WTQALPS VI KEEKMLG I LVQHKVRS GVSG I VGME VDGLP FHNTH 

AEMIQKLVDVTTAQV 


5496 


3 


2408 


QDT KMHE I YKGNI TPQIJ* KNTLKTSAATDVWAVYFSQFW IDYSG 
MKSGKGRP IS FVDS FPLS I W ICQPTRYAESQKE PQTCNQVSLNT 
SQSBS S DLAGR L KR KKLLKE Y YS TESE PI/TNGGQKPSSS DTFFR 
FSPSSSEADlKLLVHVHKHVSMQINHYQYLLLLFLHESLILliSE 
NLRKDVEAVTGS PASQTS I C I G I LLRSAELALLLH P VDQANTLK 
SPVSESVSPWPDYIiPTENGDFLSSKRXQISRDINRIRSVTVNH 
MSDKRSMSVDI^HIPLKDPLLFKSASDTNIiQKGlSFMDYLSDKH 
LGKISEDESSGLVYKSGSGEIGSETSDKKDSFYTDSSSVLNYR3 
DSNILSFDSDGNQNIIiSSTLTSKGNETIESIFKAEDLliPEAASL 
S ENLD I SKEET P P VRTLKSQS S LSGKP KERC P PN1APLCVS YXN 
MKRS S SQMS LDT IS L D S M I LEEQLLiES DGS DSHMFLE KGNKKNS 
TXWYRGTAES VNAGANLQNYGETSPDAI STNSEGAQENHDDLMS 
WVFKITGVNGE IDI RGEDTE ICLQVNQVTPDQIiGNISLRHYLC 
NRP VGSDQKAV I HS KSS PE I S LR FESGPGAVIHS LLAEKNG FLQ 
CHI KNFSTEFIiTSSLMNIQHFLEDETVATVMP^3KI QVSNTKIWL 
KDDS PRS S T VSLE PAP VTVH IDHI* WERS DDGS FHI RDS HMLNT 
GNDLKENVKSDSVLLTSGKYDLKKQRSVTQATQTSPGVPWPSQS 
ANFPEFSFDFTREQLMEENESLKQEIiAICAiCMALAEAHLEKDALL 
HHIKKMTVE 


5497 
5498


1821 


3308 


SISKLLKRRSNIDAYLLSNSCAFFAPRIiFStASQIIREQQSPNV 
CFI YK YSG FPS LECQCHFVS PHS SCYINFFS Fp P P FFVCFQL SN 

GFSHYSTjSQTTQHVnDTrSar'T X?ntJr*r nnnn T T nm rmnirrrr I 

v7coniOAJoacanviji' < nji« a ij£ PHCIjPASRLLPRVTSVHLPDYAH 1 
Y YTIGPGMFPSSQ I PS WKDWAKPGPYDQPLVNTLQRRKEKRE PD 
PNGGGPTTASGPPAAAEEAQRPRSMTVSAATRPGEEMEACEELA 
IiALSRGLQLDTQRSSRDStiQCSSGYSTQTTTPCCSEDTIPSQVS 
DYDYFSVSGDQEADQQEFDKSSTIPRNSDISQSYRRMFQAKRPA 
S TAGLPTTLGPAMVTPG VAT I RRTPSTKPS VRRGT IGAGP I P I K 
T P VI P VKTPT VPDL PG VL PAP PDGP EERGEHS PES P S VGEGPQG 
VTSMPSSMWSGQASVNPPLPGPKPS I PBEHRQAIPBSEAEDQER 
EPPSATVS PGQI PESDPADLS PRDTPQGEDMLNAIRRG VKLKKT 
TTNDRSAPRFS 




2434 


1492


I LTHQE I F TGiSKfc»CECGKAS IQMSH LSQQKI YSGENPFACKVCG 
KVFS HKSNIiTEHEH FHTRE KP FE CNECGKAFS QKQ YV I KHQNTJH 
rGEKIiFECNECGKSFSQKENLIiTHQKIHTGEKPFECKDCGKAFI 
3KSNLIRHQRTHTGEKPFVCKECGKTFSGKSNLTEHEKIHIGEK 
?FKCS ECGTAFGQKKYI. I KHQNIRTGBKP YECNECG KAFSQRTS 
Lj I VH VRIHSGDK P YECNV CGKAFS QS SS LTVHVRSHTGE KP YGC 
^ECGKAFSQFSTLALHI^IHTGKKPYQCSECGKAFSQKSHHIRH 
QKIHTH


5499 


324 


926 


GFGQIGRGHKITTYPFSPRKSGRKGMAQSQGWVKRYIKAFCKGF 
FV AVP VAVT FLDR VAC VAJR VEGAS M Q? S LNP GGS QS S D WLLNH 

WKVRNFEVHRGDIVSIiVSPKNPEQKIIKRVIALEGDIVRTIGHK 
NRYVKVPRGH 1 WVEGDHHGHSFDSNS FGPVSLGLLHAHATH I LW 
PPERWQKLESVLPPERLPVQREEE 


5500 


1978 


1286 


KPDW RLQNL p prl ylwrss r fg fghl kkrlqmdfki eht w dg FP " 

VKHEPVFIRLNPGDRGVMMDISAPFFRDPPAPLGSPGKPFNELW 
DYEWEAFFLNDITEQYLEVELCPHGQHLVLLLSGRRNVWKQEL 
PLS FRVS RGETKWEGKAYIj p ws yf p PNVTKFNS fai hg s kdkrs 

YEALYPVPQHBLQQGQKPDFHCLBYFKSFNFNTLLGEEWKQPSS 
DLWLIEKCDI 


5501 


2927 


2226 


crppvsarvApghqgavggsgrrparvewdaaarpssrpfslp 

AAIMLALISRLLDWFRSLFWKEEMELTLVGLQYSGKTTFVNVIA 
SGQFSEDMI PTVGFNMRKVTXGNVTI KI WDIGGQPRFRSMWERY 
CRG VNAI VYMIDAADREK1 EASRNELHNLLDKPQLOGI PVLVLG 

NKRDLPNALDBKQLIEKMNLSAIQDREICCYSISCKEKDNIDIT 
LQWLIQHSKSRRS 


5502 


3 


824 


NSAFPVWVPERTAiLTCPLGAAPGSSREAPGIAGPPNSTJWSKL 
GKFFKGGGS SKS RAAPS PQ EALVRLRETEEMLG KKQE YLENR I Q 
REI ALAKKHGTQNKRAALQALKRKKRFE KQLTQ I DGTLST I EFQ 
REALENSHTNTEVXRNMGFAAiCAMKSVHENMDLNKIDDIWQEIT 
EQQDIAQEISEAFSQRVGFGDDFDEDELMAELEELEQEELNKKM 

TNIRLPNVPSSSLPAQPNRKPGMSSTARRSRAASSQRAEEEDDD 
IKQLAAWAT 


5503 


216


654 


KGVRRRGRVRSDSEDSHLGYFKMSFLLPKLTSKKEVDQAIKSTA 

EKVLVLRFGRDEDPVCLQLDDILSKTSSDLSKMAAIYLVDVDQT 

AVYTQYFDISYIPSTVFFFNGQHMKVDYGGEDPALRSIKAVRRT 
SPAGTLGEKPVKS 


5504 


58 


3563


QLSFSFQAPVTFDDITVYLLQEEWVLLSQQQKEIiCGSNKLVAPL 

GPTVANPELFRKFGRGPEPWLGSVQGQRSLLEHHPGICKQMGYMG 

EMEVQGPTRESGQSLPPQKKAYbSHLSTGSGHIEGDWAGRNRKL 

LKPRSIQKSWFVQFPWLIMNEEQTALFCSACREYPSXRDKRSRL 

I EG YTGP FKVETLKYHAKS KAHMFCVNALAARDPI WAARFRS IR 

DPPGDVLASPEPLFTADCPIFYPPGPLGGFDSMAELLPSSRAEI, 

EDPGGDGAI PAM YLDCISDLRQKEXTDQ IHSSSD INI L YWDAVE 

SCIQDPSAEGLSEEVPWFEELPWFEDVAVYFTREEWGMLDKR 

QKEL YR D VMRMNYELLAS l^PAAAKPDL I S KLERRAA PW I KDPNf 

GPKWGKGRPPGNKKMVAVREADTQASAADSAIiliPGSPVEARASC 

CSSSICEEGDGPRRIKRTYRPRS IQRSWFGQFPWLVlDPKETlGb 

FCSACI ER PNLHDKS SR I» VRG YTG PFKVETLKYHB VS KAHRL CV 

NTVEIKEDTPHTALVPEISSDLMANMEHFFNAAYSIAYHSRPLN 

DFEKILQLLQSTGTVILGKYRNRTACTQFIKYISETLKREILED 

VRNS PCVS VLLDSSTDASEQACVGI YIR YFKQMEVKES Y1TLAP 

LYSETADGYFETIVSALDELDIPFRKPGWWGU3TDGSAMLSCR 

GGLVEKFQE VI PQLLPVHCVAHRIiHIAWDACGS I DLVKKCDRH 

IRXVFKFyQSS^RLNELQEGAAPLBQEIIRLKDLNAVRWVASR 

RRTLHALLVSWPALARHLQRVAEAGGQIGHRAKGKLKLMRGFHF 

VKFCHFLLD FLS I YRP LS E VCQKE I VL I TE VNATLGRAYVALES 

LRHQAGPKEEEFNASFKDGRLHGICLDKLEVAEQRFQADRERTV 

LTG IE YLQQRFDADRP PQLKNME VFDTMAWPSGIELAS FGNDDI 

LNLARYFECSLPTGYSEEALLEEWLGLKTIAQHLPFSMbCKNAL 
AQHCR FPLLS KLMAVWCVP I S TS CCERGFKAMNRI RTDERTKL 
SNEVLNMLMMTAVNGVAVTEYDPQPAIQHWYLTSSGRRFSHVYT 
ZAQVPARS PAS ARLR ECEEKGAL YVEE PRTQKP P I LP S REAABVL 
KDCIMEPPERLLYPHTSQEAPGMS 


5505 


3312 


1219


^csprslsaakt^isnrnnnklpsnlpqi^nlikrdppayiee'flq 

2YNHYKSNVEIFKLQPNKPSKEIiAEZ,VMFMA<2ISHCYPEYLSNF 
?QE VKX>LLS CNHT VLDPDLRMTFCKAL ILIjRNKNIiINPS S LLEL 
'FELFRCHDKIiliRKTLYTHIVTDIKNINAKHKNNKVNVVLQNFK 
SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine,
R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop Codon,
/=possible nucleotide deletion, \=possible nucleotide insertion)
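The legend above can be applied mechanically when reading an entry of the table. The following is a minimal sketch, not part of the disclosure: the names `LEGEND`, `decode_segment` and `codon_count` are hypothetical, and the coordinate convention (1-based, inclusive positions, with a beginning location greater than the end location read as the reverse strand) is an assumption made only for illustration.

```python
# One-letter symbols exactly as listed in the table legend.
LEGEND = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid",
    "E": "Glutamic Acid", "F": "Phenylalanine", "G": "Glycine",
    "H": "Histidine", "I": "Isoleucine", "K": "Lysine",
    "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine",
    "S": "Serine", "T": "Threonine", "V": "Valine",
    "W": "Tryptophan", "Y": "Tyrosine", "X": "Unknown",
    "*": "Stop Codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}


def decode_segment(segment):
    """Return the legend name of each symbol in a segment.

    Whitespace is ignored; any symbol that is not in the legend
    raises ValueError instead of being guessed.
    """
    symbols = [ch for ch in segment if not ch.isspace()]
    unknown = sorted({ch for ch in symbols if ch not in LEGEND})
    if unknown:
        raise ValueError(f"symbols not in legend: {unknown}")
    return [LEGEND[ch] for ch in symbols]


def codon_count(begin, end):
    """Number of complete codons spanned by two nucleotide locations.

    Assumption: positions are 1-based and inclusive, and begin > end
    simply means the segment is read on the reverse strand.
    """
    span = abs(end - begin) + 1
    return span // 3
```

Under these assumptions, an entry with a beginning location of 3 and an end location of 824 spans 822 nucleotides, so `codon_count(3, 824)` gives 274 codons. `decode_segment` rejects characters outside the legend, which makes it usable as a simple check for transcription damage in a segment.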








YTMLRDSNATAAKMSLDVMIELYRRNIWNDAKTVNVITTACPSK 
VTKI LVAALTPFLGKDEDSKQDSDS ESEDDGPTARDLLVQ YATG 
KKSSKNKKKIiEKAMKVIiKKHRKKKKPEVFNFSAIKLIHDPQDFA 
EKLLKQLECCKERFEVKMMLMNLISRliVGIHELFLFWFYPFLQR 
FLQPHQRBVTKI LLFAAQASHHLVPPEI IQSJbLMTVANNFVTDK 
NSGEVMTVGIWAIKEITARCPIiAMTEELLQDLAQYKTHKDKNVM 
MSARTLIHL FRTLNPQMLQKKFRGKPTEAS I EARVQE YGE LDAK 
DYI PGAE VLEVSKEENAEKDEDGWESTSLSEEBDADGEW IDVQH 
SSDEEQQEISKKLNSMPMEERKAKAAAISTSRVLTQEDFQKIRM 
AQMRKELDAAPGKSQKRKYIEIDSDEEPRGELLSLRDIERIjHKK 
P KS D K ETRLATAMAGKTDRKE FVRKKTKTNP FS SSTNKE KKKQ K 
| NFMMMRYSQNVRSKNKRSFREKQLAIiRDALLKKKKRMK 


5506 


1


1531 


FRGDI.CGQRGGSAPGEGGSSAWPAPAHPLPEREREREALCPGRS 
CSGGGG EET PGTTPVW S PLEGGGDEELRPNP YVRFP YRWWAVW 
LAAFPSI^AGGETPEAPPESWTQLWFFRFWNAAGYASFMVPGY 
LLVQYFRRKNYLETGRGLCFPLVKACVFGNEPKASDEVPLAPRT 
EAAETT PM WQAL KLL FCATGLQ VS YLTWGVLQERVMTRS YGATA 
TS PGER FTDSQFI*VIjMNRVIjAIjI VAGIjSCVTjCKQPRHGAPMYRY 
SFASLSNVLS SWCQYEALKFVS FPTQVIAKAS KVI PVMLMGKLV 
SRRSYEHWEYLTATLISIGVSMFLLSSGPEPRSSPATTLSGLIL 
LAG Y I AFDS FTSNWQDAIiFA YKMSS VQMMFG VNFFS CIjFTVGSL 
LEQGALLEGTRFMGRHSEFAAHALLLS ICS ACGQLFIFYTI GQF 
GAAVFTIIMTLRQAFAILLSCLLYGHTVTWGGLGVAWFAALL 
1 LRVYARGRLKQRGKKAVPVESPVQKV 


5507 


3704 


1271 



1 PRGTRRCRPAGRASRRARRRPPCPGPAAPGSLEIGGFGTAAGKK " 
VA VAD VQFGPMRFHQDQLQVLIj VFTKEDNQ CNG FCRACE KAG FK 
CTVTKE AQAVIAC FLDKHHD I III DHRNPRQLDABALCRS IRSS 
KJ.SENTVIVGWRRVDREELSVMPFISAjGFTRRYVENPNIMACY 
NELLOLEFGEVRSQLKLRACNSVFTALENSBnAISITSEDRFlQ 
YANPAFETTMGYQSGELIGKELGBVPINEKKADLIjDTINSCIRI 

GKEWQGIYYAKKKNGDNIQQNVKIIPVIGQGGKIRHYVSIIRVC

NGNNKAEKISE CVQSDTHTDNQTGKHKDRRKGSLDVKAVASRAT 
EVSSQRRHSSMARIHSMT I EAP ITKVINI INAAQESSPMPVTEA 
ZjDRVLEILRTTELYSPQFGAKDDDPHANDLVGGMSDGLRRLSG 
NEYVLSTKNTQMVSSNI ITPI SLDDVPPRIARAMENEEYWDFDI 
FELEAATHNRPIjIYLGLKMFARFGICEFLHCSESTLRSWLQIIE 
ANYHSSNPYHNSTHSADVLHATAYFLSKERIKETiDPIDEVAAL 
IAATIHDVDHPGRTNS FLCNAGSELAIL YNDTAVLESHHAALAF 
QLTTGDDKCNI FKNMERND YRTLRQG I I DMVLATEMTKHFEHVN 
KFVNSINKPLATLEENGETDKNQEVINTMIjRTPENRTLIKRMLI 
KCADVSNPCRPI^YCIEWAARISEEYFSQTDEEKQQGIjPVVMPV 
FDRNTCS 1 PKSQIS FIDYFITDMFDAWDAFVDLPDLMQHLDNNF 
KYWKGLDEMKLRNLRPPPB 


5508 


1151 


£91 


LSSVFSRRSASMFAVGCSMGPFIjHYWYIiSLDRLFPASGLRGFPN ~~ 
VLKKVLVDQLVASPLLGVWYFLGLGCLEGQTVGBSCQEIiREKFW 
EFYKADWCVWPAAQFVNFLFVPPQFRVTYINGLTLGWDTYLSYL 
K YRS P VPLTPPG C VALDTRAD 


5509 


1238 


619


RKSRGCQNAIjSASGPAAAAAAI MVRKLKFHEQKLLKQVDFLN WE" " 
VTDHNLHELRVLRRYRLQRREDYTRYNQLSRAVRELARRLRDLP 

erdqfrvrasaalldklyalglvptrgslelcdfvtassfcrrr 
lptvllklrmaqhlqaavafveqghvrvgpdwtdpafi>vtrsm 

CiUrviwv s> K. ± kkhvIiE YNEERDDFDLiEA 


5510 


96 


1195 



PAGAHI/SSGSSEPLVEPGRGRVGARVKGERGLQASGSAPGRSra^ 

AEGERQPPPDSSEEAPPATQNFIIPKKEIHTVPDMGKWKRSQAY 

ADYIGF I bTLNEGVKGKKLTFEYRVSEAIEKLVALLNTLDRW ID 

ETPPVDQPSRFGNKAYRTWYAKLDEBAENLVATWPTHLAAAVP 

EVAVYLKESVGNSTRrDYGTGHEAAFAAFLCCIiCKIGVLRVDDQ 

lAIVFKVFNRYLEVMRKCiQKTYRMEPAGSQGVWGLDDFQFLPFI 

WGSSQIilDHPYLEPRHFVDEKAVNENHKDYMFIiECILFITEMKT 

3PFAEHSNQLWNISAVPSWSKVNQGLIRMYKAECLEKFPVIQHF 

KFGSLLPIHPVTSG 
SEQ
ID 

NO: 



5511 



Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 



276 



5512 



120 



5513 



5514 



1295 



5515 



1572 



5517



246 
~~ 3 



Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 



1980 



Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine,
R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop Codon,
/=possible nucleotide deletion, \=possible nucleotide insertion)
KL3RVLNLPPENLITSISAVPISQK)E iEVADFQLSVDSLLEKDND ' 



1015



837 



449 



260 



"T35 - 



499 



1375 



HSRPDIQVQAKRLAEKLRCDTWSEISTGQRTVNFKINRELLTK 
TVLQQVI EDGS KYGLKSELFSGLPQKKI WEFSS PNVAKKFHVG 
HLRST 1 1 GNF IANLKE ALGHQVIRINY LGDWGMQ FGltbGTGFQX) 
FGYEEKLQSNPLQHLFEVYVQ VNKEAADDKS VAKAAQE FFQRLtE 
LGDVQALSLWQKFRDLSIEEYIRVYKRLGVYFDEYSGESFYREK 
SQEVLKLLESKGLhLKTIKGTAWDLSGmDPSSXCTVMRSDGT 
SLYATRDLAAAIDRMDKYNFDTMIYVTDKGQKKHFQQVFQMLK1 
MGYDWAERCQHVPFGWQGMKTRRGDVTFLEDVLNEIQLRMLQN 
MAS 1 KTTKELKNPQETAERVGLAALI IQDFKGLIiLSDYKFS WDR 
VFQSRGDTGVFliQYTHARLKSLEETFGCGYLNDFNTACLQEPQS 
VS I LQKLLR FDEVLYKSSQDFQPRHI VS YLLTLSHLAAVAHXTIi 
Q I KDS P PE VAGARtiHLFKAVRSVIiANGMKLLG ITPV CRM 
DPSLLIiTITVTGVTVLVLVLKSMNSRRRBP ITLQDPEAKYPIiPir 
I E KEKI S HNTRR FRFGLPS PDH VLGLPVGNYVQ LLAKI DNELW 

RAYTPVSSDDDRGFVDLIIKIYFKNVHPQYPEGGKMTQYIiENMK 
IGETIFFRGPRGRLFYlIGPGNLGIRPDQTSEPKKTLADHbGMIA 
GGTG I TPMLQL I RK I TKD PSDRTRMSL I FANQTEED I LVRKELE 
EIARTHPDQFDLWYTLDRPPIGWKYSSGFVTADMIKEHLPPPAK 
STL I LVCG PP PL IQTAAH P NLB KLG YTQDM I FTY 

ARWRLP5D5PRIPPAGAE TPGRGSCRNYLPSSSPPPPEPSSFPS" 
PPTSRGGPGSRDTMSDSEEESQDRQLKIWLGDGASGKTSLTTC 
FAQETFGKQYKQTIGLDFFLRRITLPGNLNVTLQIWDIGGQTIG 
GKMLDKYI YGAQGVLLV YDITK YQS FENLEDWYTWKKVS EESE 
TQPLVALVGNKIDLEHMRTIKPEKHLRFCQENGFSSHFVSAICrG 
DS VFLCFQKVAAEIIiG IKLNKAEI EQSQR WKAD I VNYNQEPMS 
RTVNPPRSSMCAVQ 

WRPSWIMGNFRGHALPG^FFFIIGLWWCTKSILKYICKKQIWT - 

CYLGSKTLFYRLEILEGITIVGMALTGMAGEQFIPGGPHLMLYD 

YKQGHWNQLLGWHHFTMYFFFGLLGVADILCFTISSLPVSLTKL 

MLSNALFVEAFIFYNHTHGREMLDIFVHQLLVLVVFLTGLVAFL 

EFLVRNNVLLELLRSSLILLQGSWFFQIGFVLYPPSGGPAWDLM 

DHENILFLTICFCWHYAVTIVIVGMNYAFITWLVKSRLKRLCSS 

EVGLLKNABREQESEEEM 

FVRLVGRGDCDPLLSVCLTTMPLVEG LGSGGEKTAWIDLGEAF 
TKCGFAGETGPRCI I PS VI KRAGMPKPVR WQYNINTEELYS YL 
KEF IHI LYFRHLLVNPRDRRWI IESVLCPSHFRETLTRVLFKY 
FEVPS VLLAPSHIMAXLTLGINS AMVLDOG YR ES LVLPI YEG I P 
VliNCWGALPI^GKALHKEIiETQLIiEQCnVDTSVAKEQSLPSVMG 
SVPEGVLEDIKARTCFVSDLKRGLKIQAAKFWIDGNNERPSPPP 
NVDYPLDGEK1LHILGSIRDSWEILFEQDNEEQSVATLILDSL 
IQCP IDTRKQLAENLWIGGTSMLPGFLHRLLAE IRYLVEKPKY 
KKALGTKTFR1HTPPAKANCVAWLGGAI FGALQDILGSRSVSKE 
YYNQTGRIPDWCSLNNPPLEMMFDVGKTQPPLM KRAFSTEK 
NSREPPQAGPGPSPRKSPTASSFLFPWR PLAS3FWMGAQGAQES ' 
IKAMWRVPGTTRRPVTGESPGMHRPEAMLLLLTLALLGGPTWAG 
KMYGPGGGKYFSTTEDYDHEXTGLRVSVGLLLVKSVQVKLGDSW 
DVKLGALGGNTQE VTLQPGE YITKVF VAFQAFLRGMVMYTS KDR 
YFYFGKLDGQISSAYPSQEGQViVGlYGQYQLLGI KS IGFEWNY 
PLEEPTTEPPVNLTYSANSPVGR 

SEIYVAMRTDSSKMTDVESGVANFASSARAGRRNALPDIQSSAA 
TDGTS PL PLKLBALS VKEDAKE KDEKTTQDQL EKPQNE EK 
DAWADAW VRAWDLNMDF PCL W LG LLLP L VAAL DFN YHRQEGME A ' 
FLKTVAQN YSSVTHLHS IGKS VKGRNLWVLWGRFPKEHR1G I P 
EFKYVANMHGDETVGRELLLHLIDYLVTSDGKDPEITWLINSTR 
I H I MP SMNPDGF EAVKKPDC Y YS I GRENYNQ YDLNRNF PDAFE Y 
NWVSRQPETVAVMKWLKTETF VLSANLKGGALVAfl YPFDNGVQA 
TGALYSRSLTPDDDVFQYLAHTYA5RNPNMKKGDECKNKMNFPN 
GVTNGYSWYPLQGGMQDYNYIWAQCFEITLELSCCKYPREEKLP 
SF^1?^?NNKASLIEYIKQ^r^LGVKGQVFPQNGNPLP^AAIVEVQDRK 








HICPYRTNKYGEYYLLLLPGSYIINVTVPGHDPHITKVIIPEKS " 

QNFSALKKDILLPPQGQLDSIPVSNPSCPMIPLYRNLPDHSAAT 

KPSLFLFLVSLLHIFFK 


5519 


87 


til 


I KS KLNQQVEVQESEWRLTEAKGPTMGKESGWDSGRAAVAAWG 
G WAVGT VLVALS AMGFTS VG XAAS S IAAKMMSTAAI ANGGGVA 
AGSLYAI LQS VGAAGLS VTSKVI GGFAGTALGAWLGS P PSS 


5520 


117 


Q A 1 


PTEGRQKVLKTFXVi«KSALAMTKTSTCIYHFLVLSWYTFLNYYI 
SQEG KDE VKP KILANGAR WKYM TL LNLLLQTI F YG VTCLDD VL K 
RTKGGKDIKFLTAFRDLLFTTLAFPVSTFVFLAFWILFLYNRDL 
IYPKVLDTVIPVWLNHAMHTFIFPITIAEVVLRPHSYPSKKTGL 
TLLAAAS IAYISRILWLYFETGTWVYPVFAKLSL1LGI1AAFFSLS 
YVF IAS I YLLGEKLNHWKW VSVQ ILQRWRLES VGIC FQW PDWKS 
PAKHQLVKNIR 


5521 


546 


911 


K I LNMQ KS CEEN EG KPQNM PKAE El t)R PLEDVPQE AEGN PQP SE E 
GVSQEAEGNPRGGPNQPGQ3FKEDTPVRHLDPEEMIBGVDELEK 
LREE I RRVRNKFVMMHWKQRHSRSRP YPVCFRP 


5522


1224 


637


GSRPLGQRSREKMWVFGYGSLIWKVDFPYQDKLVGYITNYSRRF 
WQGS TDHRGV PGKPGR WTLVED PAG C VWG VAYRLFVGKEE BVK 
AYLDFREKGGYRTTTVIFYPKDPTTKPFSVLLYIGTCDNPDYLG 
PAPLEDIAEQIFNAAGPSGRNTEYLFELANSIRNIjVPEKADEHL 
FALE KLVKE RLEGKQNLNC I 


5523 




1280 


S KGKKRMGS SMSAATARRPVFDDKEDVNFDHFQ I LRA1GKGSFG 
KVC I VQKRDTE KM YAM KYMNKQQC I ERDE VRNVFRELE I LQ E I E 
HVFLVNLWYSFQDEEDMH-IWDLLLGGDLRYHLQQNVQFSEDrV 
RLYI CEMAIiALDYLRGQHI I HRD VKPDNI LLDERGHAHL TD FNI 
ATIIKDGERATALSGTKPYMAPEIFHSFVKGGTGYSFEVDWWSV 
GVMAYEI^LRGWRPYDIHSSNAVBSIiVQIjFSTVSVQYVPTWSKEM 
VALLRKLLTVWPEHRIjSSLQDVQAAPAIiAGVLWDHLSEKRVEPG 

fvpnkgpjlhcdptfeleemilesrplhkkkkriaknksrdnsrd 
ssqsendylqdcldaiqqdfvifnreklkrsqdlpreplpapes 
rdaae p vedeaersalpmcg pi cpsagsg 


5524


85 


2318 


RERERDHRPGESSQGQSGAGGCFPSPTMELRCGGLLFSSRFDSG 
NLAHVEKVESLS SDGEGVGGGASALTSGI ASS PDYE FNVWTRPD 
CAETE FENGKRS WF YFS VRGGMPGKL I KINIMNMNKQSKLYSQG 
MAPFVRTLPTRPRWERIRDRPTFEMTETQFVLSFVHRFVEGRGA 
TTFFAFCYPPSYSDCQELLNQLDQRFPENHPTHSSPLDTIYYHR 
ELLCYSLDGIiRVDLLTITSCHGLREDREPRLEQLFPDTSTPRPF 
RFAGKRIFFLSSRVHPGETPSSFVFNGFLDFILRPDDPRAQTLR 
RLFVFKLIPMLNPDGWRGHYRTDSRGVNLNRQYLKPDAVLHPA 
IYGAKAVLLYHHVHSRLNSQSSSEHQPSSCLPPDAPVSDLEKAN 
NLQKEAQCGHS ADRHNAE AWKQTEPAEQiKLNS VW I M PQQS AGLE 
ESAPDT I PPKESGVAYYVDLHGHASKRGCFMYGNS FSDESTQVE 
NMLYPKLISLNSAHFDFQGCNFSEKNMYARDRRDGQSKEGSGRV 
A I YKASGI IHS YTLE CN YNTGRS VNS I PAACHDN GRAS PPP PPA 
FPS R YTVE LFEQVGRAMAI AALDMAE CNP W PRI VLS EHSS LTNL 
RAWMLKHVRNSRGr,SSTLNVGVNKKRGLRTPPKSHNGLPVSCSE 
NTLSRARSFSTGTSAGGSSSSQQNSPQMKNSPSFPFHGSRPAGIi 
PGLGSSTQ KVTHRVZ/jPVRGKPVWEPLQHVFGCLGHCWGK 


5525 


105 


834 


SNTLDFERHJjFIMGQQISDQTQLVINKLPEKVAKHVTLVRESGS 
LTYEEFLGRVAELN1)VTAKVASGQEKHIiLFEVQPGSDSSAFWKV 
WRWCTKINKSSGIVEASRIMNLYQFIQLYKDITSQAAGVLAQ 
SSTSEEPDENSSSVTSCQASLWMGRVKQIiTDEEECCICMDGRAD 
LI LP CAHS FCQKC I DKWSDRHRNCPI CRLQMTGANES WWSDAP 
TEDDMANYILNMADEAGQPHRP 


5526 


3 


853 


ItKPLNF VRAAKRTGAAARA PRGLE VTMLR VAWRTLS LI RTRAVT 5 
QVLVPGLPGGGSAKFPFNQWGLQPRSLLLQAARGYWRKPAQSR 
LDDDPPPSTLLKDYQNVPGIEKVDDWKRLLSLEMANKKEMLKI 
KQEQFMKKIVANPEDTRSLEARIIALSVKIRSYEBHLEKKRKDK 
ftHKRYLLMSIDQRKKMLKNLRNTNYDVFEKlCWGLG IE YTFP PL 
yYRRAHRRFVTKKALCIRVFQETQKLKKRRRALKAAAAAQKQAK 








RRN9DSPAKAI PKTLKDSQ 


5527 


3225 


565 


LLR X YLLHQN PLLLRHQ PNRTCI S FS ATMKLKDTKS RPKQSS CG 
KFQTKGIKWGKWKEVKIDPNMPADGCMDDLVCFEELTDYQLVS 
PAKNPSSLPSKEAPKRKAQAVSEEEEEBEGKSSSPKKKIKLKKS 
KNVATEG TS TQ KE FE VKD PELEAQGDDMVCDDPEAGEMTS ENL V 
QTAPKKKKNKGKKGLEPSQSTAAKVPKKAKTWIPEVHDQKADVS 
AWKDLFVPRP VLRALS PLGFSAPTP IQALTLAPAIRDKLDILGA 
AETGSGKTIiAFAIPMIHAVXjQWQKRNAAPPPSNTEAPPGETRTE 
AGAKTRS PG KAEAESDALPDDTVI ES EALPSD I AAEARAKTGGT 
VSDQALL PGDDDAGEGPS SLIRE KPVPKQNEWEEENLDKEQTGN 
liKQELDDKSATCKAYPKRPLLGLVLTPTRELAVQVKQHIDAVAR 
FTG 1 KTAILVGGMSTQKQQRMLNRRPE I WATPGRLWELI KEKH 
YHLRNLRQLRCLVVDEADRM\raKGHFAELSQLLEMLNDSQYNPK 
RQTIiVFSATLTLVHQAPARILHKKHTKKMDKTAKLDLLMQKIGM 
RGKPKVI DLTRNEATVE TLTETK IHCETDE KDF YL Y YFLMQYPG 
RS LVFANS 1 S C I KRLS GLLKVLD I M PLTLH ACMHQKQRLRNLEQ 
FARLEDCVLLATDVAARGLDIPKVQHVIHYQVPRTSEIYVHRSG 
RTARATNEGLSLMLIGPEDVINPKKIYKTLKKDEDIPLFPVQTK 
YMD WKERI RLARQ I E KS E YRNFQACLHNSW I EQAAAALE I ELE 
EDMYKGGKADQQEERRRQKQMKVLKKELRtiliI»SQPLFTESQKTK 
YPTQ/SGKPPLLVSAPS KSESALS CLS KQKKKKTKKPKEPQPEQP 
QPSTSAN 


5528 


3 


895 


GPFLSACRMWGACKVKVHDSLAT IS I TLRRYLRLGATMAKSKFE ' 
Y VRD FEADDTCLAHCW VVVRLDGRWFHRFAEKHNFAKPNDS RAXt 
QLMTKCAQTVMEELED I VI AYGQSDEYS FVFKRKTNWFKRRASK 
FMTHVASQFASSYVFYWRDYFEDQPLI*YPPGFDGRVWYPSNQT 
LKDYIiSWRQADCHXNNLYNTVFWALXQQSGLTPVQAQGRIiQGTL 
AADKNE ILFS EFN1N YNNE PPM YRKGTVLI WQKVDEVMTKE IKL 
PTEMEGKKMAVTRTRTKPCKPSHLPRAPCLRWL 


5529 


48 


640 


TFRLVSAHLKTRKLINPEAAERRWRDWDSRQGWLSVKMQRVSGL 
LSWTLSRVLWLSGLSEPGAARQPR I MEEKALE vydlirti rdpe 
KPNTLEELEWSESCVEVQEINEEEYIiVIIRFTPTVPHCSLATL 
IGLCLRVKLQRCLPFKMKLEIYISEGTHSTEEDINKQINDKERV 
AAAMENPNLRE I VEQCVLE PD 


5530 


4541 


2606 


AQIVHAISYCHKLHVGHRDI*KPENVVFFEKQGX>VKIiTDFGFSNK 
FQPGKKLTTS CG S LAYS AP E I LLGD E YDAP AVD I WSLGVI LFML 
VCG QPPFQEANDS ETLTMIMDCK YTVPSHVSKECKDLI TR MLQR 
DPKRRASIiEEIENHPWLQGVDPSPATKYNIPLVSYKNLSEEEHN 
SIIQRMVLGDlADRDAIVEALETNRYNHITATYFLLAERIIiREK 
QEKEIQTRSA5PSNIKAQFRQSWPTKI0VPQDLEDDLTATPLSH 
ATVPQSPARAADSVLNGHRSKGLCDSAKKDDLPE LAG PALS TVP 
PASLKPTASGRKCLFRVEEDEEEDEEDKKPMSLSTQWLRRKPS 
VTNRLTSRKSAPVLNQlFEEGESDDEFDWDENIiPPKLSRLKWNI 
AS PGTVHKR YHRRKSQG RGS S CSSS ETS DDDS E SRRRLDKDS GF 
T YS WH R RDSS EG P PGS EGDGGGQS KPSNASGG VDKAS PSENNAG 
GGS PS S GSGGNPTNTSGTTRRCAGP S NSMQLASRS AGEL VE SL K 
LMSLCLGSQLHGSTKYI IDPQNGLS FS S VKVQEKS TWKMCI SST 
GNAGQV PAVGG I KF FSDHMADTTTEL ER I KS KNLKNNVLQL PLC 
EKTISVNIQRNPKEGLLCASSPASCCHVT 


5531


24 


515 


GSOPRAPRPRDSMERPEPELIRQSWRAVSRSPLEHGTVLFARLF 
ALEPDLLPLFO YNCRQFSS PEDCT »SS PRFrjDH TP in/MT ,vt na &w 
TNVEDL S SLE EYLASLGR KHRAVGVKLS S FS TVGES LLYMLEKC 
LGPAFTPATRaA WS QL YG AWQAMSRG WDGE 


5532


3395 


1402 


SDWMWGKRKMIlEDETEFCGEELLHSVLQCKSVFDVliDGEEMR ' 
RARTRANPY^IRGVFFLNRAAMKMANMDFVFDRMFTNPRDSYG 
KPLVKDREAELLYFADVCAGPGGFSEYVLWRKKWHAKGFGMTLK 
GPNDFKLEDFYSASSELFBPYYGEGGIDGDGD1TRPENISAFRN 
FVLDNTDRKGVH FLMADGG FS VEGQENLQEI LS KQLLLCQFLMA 
LS IVRTGGHFICKTFDLFTPFS VGLVYLL YCCFERVCLFKP ITS 
R PANS ER YWCKGLKVG IDD VRD YL FAVN I KLNQLRNTDSDVNL 








WPLEVIKGDHEFTDYMIRSNESHCSLQIKALAfCIHAFVQDTTlT 
SEPRQAEIRKECLRLWGIPDQARVAPSSSDPKSKPPELIQGTEI 
DIPSYKPTLLTSKTLEKIRPVFDYRCMVSGSEQKFLIGLGKSQI 
YTWDGRQSDR W I KLDL KTE LPRDTLLS VE I VHEL KGEGKAQRK1 
SAI H I LDVLVLNGTDVREQHFNQRI QLAEKFVKAVSKPS RPDMN 
P I R VKE VYR LEEMEK I FVR LEMKI I KGSSGTPKLSYTGRDDRHF 
VPMGLYIVRTVNEPWTMGFSKSPKKKFFYNKKTKDSTFDIjPADS 
I AP FH I CYYGRLFWE WGDG I RVHDSQKPQDQDKLS KEDVLS FIQ 
MHRA 


5533 


94 


789


MKERRAPQ P WARCKLVLVGD VQCG KTAMLQVLAKDCY PE TY VP 
TVFEN YTACLETEEQRVELSLWDTSGS P YYDNVRPLC YSDS DAV 
LLiCFD i SRPETVDSALKKWRTEI LDYCPSTRVLL I GCKTDLRTD 
LSTLMELSHQKQAPISYEQGCAIAKQLGPEIYLEGSAFTSEKSI 
HS I FRTASM LCLN KPSPL PQKS P VR SLS KRLLHLP S RS EL I S PT 
FKKEKAKXCSIM 


5534 


3 


605 


LVRGRARAANPGRVGAMDGLRQRVEHFLEQRNLVTEVU3ALEAK 
TGVEKRYLAAGAVTLLSLYIiLFGYGASLLCNIilGFVYPAYASIK 
AIESPSKDDDTVWLTYWVVYAIiFGIiAEFFSDLLLSWFPFYYVGK 
CAFLLFCMAPRPWNGALMLYQRWRPLFLRHHGAVDRIMNDLSG 
RALDAAAGITRNVKPSQTPQPKDK 


5535 


1029 


332 


KSFMDSEARIiCSLVELSDTQDETQKSDSENEDIiKIDCLQESQEL 
NLQKLKNSERILTEAKQKMRELTVNI KMKEDL I KE L I KTGNDAK 
S VS KQ YTLKVTKLEHDAEQAKVELTE tqkqlqelenkdls dvam 
KVKLQKEFRKKVDAAKCjRVQVIjQKKQQDSKKIiASL»SIQNEKRAN 

ELEQSVDHMKYQKIQLQRKLQEENEKRKQLDAVIKRDQQKIKVI

LSYIPAKYNMKC 


5536 


942 


282 


AAATAAS LS PRGCRLRT PS SDVSPSRAPPPSAAPLPrGRAQMSP 
SGRLCLLTIVGLILPTRGQTLKDTTSSSSADATIMDIQVPTRAP 
DAVYTELQPTSPTPTWFADETPQPQTQTQQLEGTDGPIjVTOPET 
HKSTKAAHPTDDTTTLSERPSPSTDVCyrDPQTLKPSGFHBDDPF 
F n?EHTLRKRGIjL VAAVLF ITGX 1 1 LTSGKCRQ LSRLCRNHCR 


5537 


3 



2391 


RARVSSPQLRVFRSGRPRRLRVbRINRTSVALRLAGTGRFVAlCt 

PGHPGSWEMGLLTFRDVAVEFSLEEWEHLEPAQKNIiYQDVMLEN 

YRNLVSLGLWSKPDLITFLEQRKEPWNVKSEETVAIQPDVFSH 

YNKDLLTEHCTBASFQKVISRRHGSCDLENLHLRKRWKREECEG 

HNGC YDEKTFKYDQFDES S VES LFHQQ I LS SCAKS YN FDQ YRJCV 

FTHSSLLNQQEEIDIWGKHHIYDKrSVLFRQVSTLNSYRNVFtG 

EKNYHCNNS EKTLNQS SS P KNHQENYF LEKQ YKC KE F3EVFLQS 

MHGQEKQEQS YXCNK CVE VCTQS LKH I QHQTIHIRENS YS YNK Y 

DXDLSQSSNLRKQIIHNEEKPYKCEKCGDSLNHSLHLTQHQIIP 

TEEKP YKWKEGG KVFNLNCSL YLTKQQQ I DTGENL YKCKACS KS 

FTRSSNLIVHQRIHTGEKPYKCKECGKAFRCSSYIiTKHKRiKTG 

E KPYKC KECG KAFNRS S CLTQHQTTHTGEKL YKCKVCS ICS YAJR S 

SNLIMHQR VHTGEKP YKCKECGKVFSRS SCLTQHRKIHTGBNLY 

KCKVCAKP FTCFSNIj I VHERIHTGEK P YKCKECG KAF? YSSHL I 

RHHRIHTGEKPYKCKACSKSFSDSSGLTVHRRTHTGEKPYTCKE 

CGKAFSYSSDVIQHRRIHTGQRPYKCEECX5KAFNYRSYLTTHQR 

SHTGERPYKCEECGKAFNSRSYLTTHRRRHTGERPYKCDECGKA 

FS YRS YLTTHRRSHS GERPYKCEECGKAFNSRS YL IAHQRSHTR 

EKL 


5538 


926 


161 


x to ri i u t i t\ a * oxr V Ltl v lljL)LtXjLi\jLi±. Di. oQAQLSCTG PPAI PGI PG 

I PGTpG PDGQPGTPG I KGEKGL PGLAGDHGE FGE KGDPG I PGN P 
GKVGPKGPMGPKGGPGAPGAPGPKGESGDYKATQKIAFSATRTI 
NVPLRRDQTIRFDHVI TNMNWNYEPRSGKFTCKVPGI/YYFTYHA 
SSRGNLCVNLMRGRERAQKVVTFCDYAYNTFQVTTGGMVLKLEQ 
3 ENVFLQ ATDKNS L LG MEGANS I FSGFLLFPDMEA 


5539 


38 


1258 


HRGPSGAAAPGCALPRGQAIiEGPRSCRRPQPMARRYDELPHYPG 
IVDGPAALASFPETVPAVPGPYGPHRPPQPLPPGIiDSDGLKREK 
DE I YGHPLF PLLAL VFE KCELATCS PRDGAGAGLGTP PGGDVCS 
SDS FN ED I AAFAKQ VRS ER PL FS SNPELDNL VI QAI Q VLRFHLL 








ELEKVHOLCDNFCTRYITCLKGKMPIDLVIEDRDGGCREDFEDY 
PASCPSLPDQNNMWIRDHEDSGSVHLGTPGPSSGGLASQSGDNS 
SDQGDGLDTSVASPSSGGEDEDLDQERRRNKKRGIFPKVATNIM 
RAWLFQHI*SHPYPSEEQKKQLAQDTGLTILQVNNWFINARRRIV 
QPMIDQSNRTGQGAAFSPEGQPIGGYTETQPHVAVRPPGSVGMS 
LNLEGEWHYL 


5540 


148 


1440 


PPI/GAGAGVHARSPHPAKRLPLTTAGVGGRAPDLLPTPWRQHRG 
PSGAAAPGCALPRGQALBGPRSCRRPQPMARRYDELPHYPGIVD 
GPAALASFPETVPAVPGPYGPHRPPQPLPPGLDSDGLKRBKDEI 
YGHPLFP LLALVFE KCELATCS PRDGAGAGLGTPPGGDVCS SDS 
FNEDNTAFAKQVR S 3RPLFSSNPEUDNLMIQA1QVLRFHLLELE 
KGKMPIDLVIEDRDGGCRBDFBDYPASCPSLPDQmiHIRDHED 
SGSVHLGTPGPSSGGLASQSGDNSSDQGVGLDTSVASPSSGGED 
EDLDQE PRRNK KRG I F PKVATN I MRAWL FQHLSHP Y P SEEQKKQ 
LAQDTGLT I LQ VNNW F I NARRR I VQPM I DQSNRTGQGAAFS PEG 
QPIGGYTETEPHVAFRAPASVGDEFGTRKEEWHYL 


5541 


143 


1440 


P PLG AGAG VHARS PH PARRL P LTTAGVGGRAPDLLPT PWRQHRG 
PSGAAAPGCALPRGQALEGPRSCRRPQPMARRYDELPHYPGIVD 
GPAALASFPETVPAVPGPYGPHRPPQPLPPGLDSDGLKREKDEI 
YGHPL F PLLALVFE KCEUYTC S PRDGAGAGLGTP PGGDVCS SDS 
FNEDNTAF AKQVRS ERPLFS SNPELDNLMI QAI QVLRFHLIjELE 
KGKMPIDLVIEDRDGGCREDFEDYPASCPSLPDQNNIWIRDHED 
SGS VHLGTPG PSSGGLASQSGDNS SDQG VGLDTS VASPSSGGED 
EDLDQEPRRNKJKRGIFPKVATNIMRAWLFQHLSHPYPSEEQKKQ 
I*AQDTGl,TIIiQVNNWFINAKRRIVQPMIDQSNRTGQGAAFSPEG 
QP IGG YTET3PHVAFRAPAS VGDE FGTRKEEWHYL 


5542 


148 


1440 


PPIiGAGAGVHARSPHPARRLPIiTTAGVGGRAPDLLPTPWRQHRG" 
PSGAAAPGCALPRGQALEGPRSCRRPQPMARRYDELPHYPGIVD 
G PAALAS F P ETVPAVPGP YG PHR PPQPLP PGLDSDGLKREKDE I 
YGH P LF PLLAL VFEKCE LATCS PRDGAGAGLGTP PGGDVCSSDS 
FNBDNTAFAKQVRSERPLFSSNPELDNLMIQAIQVLRFHLLELE 
KG KM P IDL V I EDRDGG CREDFED YPAS CPSLPDQNNI W IRDHED 
SGSVHLGTPGPSSGGLASQSGDNSSDQGVGLDTSVASPSSGGED 
EDLDQEPRRNKKRGIFPKVATNIMRAWLFQHLSHPYPSEBQKKQ 
LAQDTGLT I LQVNNWFINARRR I VQPMIDQSNRTGQGAAF3 PEG 
QPIGGYTETEPHVAFRAPASVGDEFGTRKEEWHYL 


5543 
5544


2405 


665 


RWVREQPWPLRTSEAVKTPALRPFPGPRGVSPFPKPDWGKSPAP 
KRPFSDSGAFWSPERRPG VDEAPRRR PVPAS FRAVPPKPTR VHG 
SSASRDRVLARTMIVADSECRAELKDYLRFAPGGVGDSGPGEEQ 
KBSRARRGPRGPSAFIPVEEVLREGAESLEQHLGLEALMSSGRV 
DN1AVVMGLHPDYFTSFWRLHYLLLHTDGPLASSWRHYIAIMAA 
ARHQCSYLVGSHMAEFLQTGGDPEWLLGLHRAPEKLRKLSEINK 
LLAHRPWLITKEHIOAIiLKTGEHTWSLABL IQALVLLTHCHSLS 
SFVFGCGILPEGDADGSPAPQAPTPPSEQSSPPSRDPLNWSGGF 
ESARDVEALMERMQQLQBSLLRDEGTSQEEMESRFELBKSESLL 
VTPSADILEPSPHPDMLCFVEDPTFGYEDFTRRGAQAPPTFRAQ 
DYTWEDHGYSLIQRLYPEGGQLLDEKFOAAYSLTYNTIAMHSGV 
DTS VLRRAIWN Y IHCVFG IRYDD YD YGEVNQLLERNLKVY I KTV 
AC YPEKTTRRM YNLFMRHFRHS EKVIIVNIiLLLEARMQAAL L YAL 
RAITRYMT 




1895 


514 


LGGLLGRQRLLLRMGAGRLGAPMERHGRASATSVSSAGEQAAGD 
PEGRRQEPLRRRASSASVPAVGASAEGTRRDRLGSYSGPTSVSR 
QRVESLRKXRPLFPWFGLDIGGTLVKLVYFEPKDITAEEEEEEV 
ES LKS I R KYLTSNVAYGS TG I RDVHLE LKDLTLCGRKGNLHFI R 
FPTHDMPAFIQMGRDKNFSSLHTVFCATGGGAYKFEQDFLTIGD 
LQLCKLDELD CLI KGILYI DS VGFNGRS QCY YFENPADS EKCQK 
LPFDLKNP YPLLLVNIGSGVS ILAVYSKDN YKRVTGTSLGGGTF 
FGLCCLLTGCTTFEEALEMASRGDSTKVDKLVRD1YGGDYBRFG 
LPGWAVAS SFGNMMSKEKREAVS KEDLARATLI T I TNNIGS 1AR 
^ICALNEN I NQ WF VGJI FLRIUTI AMRLIiAYALD YWS KGQLKALF 
3EHEGYFGAVGALLELLKIP 


5545 


802 


131 


gamwsagrggaawpvi>lglllalii,vpgggaaktgaelvfcgsvii 
kllnthhrvrlhsiidikygsgsgqqsvtgveasddansywrirg 
qsegg c prg s p vrcx3qavrlthvi*tgknlhthhf ps plsnnqev 

SAFGEDGEGDDLDLWTVRCSGQHWEREAAVRFQHVGTSVPIiSVT 
GSQYGS PIRGQHEVHGMPSANTHNTWKAMEGIFI KPSVEPSAGH 
DEL 


5546 


1592 


146 


FVPRGGHSSMGQSGRSRHQKRARAQAQLRNLEAYAANPHS FVFT ~ 

RGCTGRNrRQLSLDVRRVMEPLTASRLQVRKKNTSLKDCVAVAGP 

LGVTHPLILSKTETNVYFKLMRIiPGGPTIjTFQVKKYSLVRDVVS 

SLRRHRMHEQQFAHPPLLVLNSFGPHGMHVKIiMATMFQNLFPSI 

NVHKVNLOTIKRCLLIDYNPDSQELDFRHYSIKVVPVGASRGMK 

KLLQEKFPNMSRLQDISELLATGAGLSESEAEPDGDHNITELPQ 

AVAGRGNMRAQQS AVRLTE I GPRMTLQLI KVQEG VGEGKVM FHS 

FVS KTEEELQAI LBAKEKKLRLKAQROAQQAQNVQRKQEQREAH 

RKKSLEGMKKARVGGSDEEASG2PSRTASLELGEDDDEQEDDD1 

EYFCQAVGEAPSEDLFPEAKQKRLAKSPGRKRKRWEMDRGRGRL 

CDQ KF PKTKDKSQGAQARRGPRGAS RDGGRGRGRGRPGKR VA 


5547 


1532 


146 


FVPRGGHSSMGQSGRSRHQKRARAQAQLRNLEAYAANPilSFVFT 
RGCTGRNIRQLSLDVRRVMEPLTASRLQVRKKNSLKDCVAVAGP 
LGVTHFLILSKTETNVYPKLMRLPGGPTLTFQVKKYSLVRDWS 
SLRRHRMHEQQFAHPPLLVLNSFGPHGMHVKLMATMFQNLFPS I 
NVHKVNLNTIKRCIjLIDYNPDSQELDFRHYSIKWPVGASRGMK 
KLLQEKFPNMSRLQDISELLATGAGLSESEAEPDGDHNITEIjPQ 

avagrgnmraqqsavri»teigprmtlqi,ikvqegvgegkvmfks 
fvs kteeelqai leaxekklrlkaqrqaqqaqnvqrkqeqreah 
rkkslegmkkarvggsdeeasgi psrtaslelg edddeqedddi 
EYFCQAVGEAPSEDLFPEAKQKRLAKSPGRKRKRWEMDRGRGRL
CDQKFPKTKDKSQGAQARRGPRGASRDGGRGRGRGRPGKRVA


5548


1 


2153 


DQTGPPETIAFTFPRSTMEPLCPLLLVGFSLPIiARALRGNETTA" 
DSN ETTTTSG P PD PGAS QP L LAWLLL PIiIjLLIiIi VTjL-IjAAYF FRF 
RKQRKAWSTSDKKMPNGILEEQEQQRVMLLSRSPSGPKKYFPI 
PVEHLEEEIRIRSADDCKQFREEFNSLPSGHIQGTFEliANKEEN 
REKNRYPWILPWDHSRVJI,SC3LlX5IPCSDYINASYIZX3YiCEKNK 
F I AAQGP KQE TVND FWRMVW EQKS AT I VMLTNLKERKEEKCHQY 
WPDQGCWTYGNIRVC^DCVVIiVDYTIRKFClQPQLPDGCKAPR 
LVSQLH FTSWPD FGVPFT P I GMLKFL KKVKTLNP VHAGP I VVHC 
SAGVGRTGTFI VIDAMMAMMHAEQKVDVFEFVS RI RliQR PQMVQ 
TDMQ YTF I YQALLE YYLYGDTELDVSS LEKHLQTMHGTTTHFDK 
IGLEEEFRKLTNVRlMKENMRTGbnjPANMKKARVlQIIPYDFNR 
VILSMKRGQEYTDYINASFIDGYRQKDYFIATOGPLAHTVEDF*W 
RM IWEMKSHTI VMLTEVQBREQDKCYQ YWPTEGS VTHGE ITI EI 
KNDTLSEA1S IRDFLVTLNQPQARQEEQVRVVRQFHFHGWPE IG 
I PAEGKGMIDLI AAVQKQQQQTGNHP ITVHCS AGAGRTGTF I AL 
SNILERVKAEGLLDVFQAVKSLRLQRPHMVQTLEQYEFCYKWQ 
DFIDI FSDYANFK 


5549


915 


256 


FEATGGKRLAFKMAGTARHDREMAIQAKKKLTTATDPIERLRLQ 
C LARGS AG I KGLGRVFR IMDDDNNRTLDFKEFMKGLNDYAWME 
KEE VE ELFQRFDKDGNGT I DFNEFLLTLRPPMSRARKEVI MQAF 
RKLDKTGDGVIT I EDLRE VYNAKHHPKYQNGE WS EEQVFRKFLD 
NFDSPYDKDGLVTPEEFMNYYAGVSAS IDTDVYFI IMMRTAWKL 


5550 


2364 


1210 


RKRKVFLKMRRLNRKKTLSLVKELDAFPKVPESYVETSASGGTV 
SLIAFTTMALLTI MEFSVYQDTWMKYE YEVDKDFSSKLRIN I DI 
TVAMKCQ Y VGAD VLDri AETMVAS ADGIiVYE PTV FDT*S FQQKEWQ 
RMLQLI QSRLQEEHSLQDV1 FKSAFKS TSTALPPREDDSSQS PN 
ACRIHGHLYVNKVAGNFHITVGKAIPHPRGHAHLAALVNHESYN 
FSHRI DHLSFGBLVPAI INPLDGTEK I AI DHNQMFQY FIT WPT 
KLHTYK I SADTHQFS VTERBR X INHAAGSHGVSGI FMKYDL5SL 
M VTVTE EHMPFWQFFVRLCG I VGG I FS TTGMLHG I GKF I VE 1 1 C 
CRFRLGS YKPVNS VP FEDGHTDNHLPLLENNTH 


5551 


211 


1700 


MQREHTMDYKESCPSVSIPSSDEHREKKKRFTVYKVLVSVGRSE'"" 








WFVFRRYAEFDKLYNTLKKQFPAMALKI PAKRI FGDNFDPDFI K 
i UKKAOijN fc.fr iQNLVRYPELYNHPDVKAFLQMDSPXHQSDPSEDE 
DERSSQKLHSTSQNINLGPSGNPHAKPTDFDFLKVIGKGSFGKV 
LI^RKLDGKFYAVKVI^KKI\^NRKEQKHIMAERNVLLKNVKH 
PFLVGLHYSFQTTEKIiYFVLDFVNGGELFFHLQRERSFPEHRAR 
FYAAE I ASAIiGYLHS I KI VYRDLKPENILLDSVGHVVLTDFGLC 
K EG I A I S DTTTTFCGT PE YLAPE VI R KQP YDNT VDWWCLG AVI*Y 
EMIj YGLP P F YCRD VAEMYDN I LHKPLSIiR PG VS LTAWS ILE ELL 
EKDRQNRLGAKEDFLE IQNHPFFESLSWADLVQKKIPPPFNPNV 
AG PDD I RWFDTAFTE E TVP YS VC VS S DYS I VNAS VLEADDAFVG 
j FSYAPPSEDLFL 


5552 
5553


2746 


930 


J LGPAAGAAMGKKHKKHKAEWRSSYEDYADKPLEKPLKIiVLKVGG 
SEVTELSGSGHDSSYYDDRSDHERERHKEKKKKKKKKSBKEKHL 
DDEERRKRKEEKKRKREREHCDTEGEADDFDPGKKVEVEPPPDR 
P VRACRTQ PAENE S T P I QQLLEH FLRQLQRKDPHG FFAFPVTDA 
I APGYSMI I KHPMDFGTMKDK1 VANEYKSVTEFKADFKLMCDNA 
^4TYNRPDTVYYKLAKKILHAGFKMMSKQAALLGNEDTAVEEPVP 

1 EWPVQVETAICKSKKPSREVISCMFEPEGNACSLTDSTAEEHVL 
ALVEHAADEARDRINRFLPGGKMGYLKRNGDGSLLYSWNTAEP 
DADEEETHPVDliSSLSSKLLPGFTTLGFKDERRNKVTFLSSATT 
ALSMQNNSVFGDLKSDEMELLYSAYGDETGVQCALSIjQEFVKDA 
GSYSKKWDDLLDQITGGDHSRTLFQLKQRRNVPMKPPDSAKVG 
DTLGDSSSSVLEFMSMKSYPDVSVDISMLSSLGKVKKELDPDDS 
HLNLDETTKLLQDLHEAQAERGGSRPSSNLSSLSNASERDQUHIi 
GSPSRLSVGEQPDVTHDPYEFLQSPEPAASAKT 


5554 


74 


1095 


LGREAVYLVSRMDGPVAEHAKQEPFHWTPJbLBSWALSOVAGMP^ 
VFLKCENVQPSG5FKIRGIGHFCQEMAKKGCRHLVCSSGGNAGI 
AAA YAARKLG I PAT 1 VL PES TSLQ WQRLQGEGAE VQLTGKVWD 
EANLRAQEIAKRDGWENVPP FDHPLI WKGHASLVQELKAVLRTP 
PGALVLAVGGGGLLAG WAGLLE VGWQH VP 1 I AMETHGAH CFNA 
AITAGKLVTLPDITSVAKSLGAKTVAARALBCMQVCK1HSEWE 
DTEAVS AVQQLLDDERMI*VE PACG AALAA I YSG LLRRLQAEGCL 
PPS LTS VWI VCGGNNINSRELQALKTHLGQV 


5555




2318


CSGR TGGRGS1*R PAENV CI/TCK LS GAETRGLIiC PALRTWIMK VL 
GRSFFWVLFPVLPWAVQAVEHEEVAQRVIKLHRGRGVAAMQSRQ 
WVRDSCRKLSGLLRQKNAVLWKLKTAIGAVEKDVGLSDEEKLFQ 
VHTFE 1 FQKELNESENS VFQAVYGLQRALQGDYKDWNMKESSR 
QRLEALREAAI KEETE YMELLAAEKHQVEALKNMQHQNQSLSML 
DEILEDVRKAADRLEEBIEEHAFDDNKSVKGVNFEAVLRVEEEE 
ANSKQNITKREVEDDLGLSMIilDSQNNQYILTKPRDSTIPRADH 
nrx KD I VTIGM LS L PCG WLCTAIGLPTMFGYI I CGVLLGP S GLN 
SIKSIVQVETLGEFGVFFTLFLVGLEFSPEKLRKVWKISI.QGPC 
YMTLLMI AFGLLWGHIiLRI KPTQSVF ISTCLSLSSTPLVSRFLM 
GSARGDKEGD ID YS WLLGMLVTQDVQiiGLFKAVM PTLIQAGAS 
ASSS I WE VLR I LVLIGQI LFSLAAVFLL CL V I KKYLI G P YYRK 
LHMBSKGWKEILILGISAFIFLMLTVTELLDVSMELGCFLAGAL 
VSSQGP WTEE I ATS IEP I RDFLAI VFFAS IGLHVFPTFVAYEL 
TVLVFLTLSVVVMKFLLAALVLSLILPRSSQYIKWrVSAGLAQV 
S3FS FV LGS RARRAGVI SRE VY LLIIiS VTTLSLLLAP VLWRAA t 
TRCVPRPERRSSL 




212 

§835 H 


1425 T* 

3346 1 I 


LSLRTRETPAPPRCEAASQGRVGWRADAAAEEAVRSVWNRTRDR 
OTMAPQNLS TF CLLLL YL IGA VIAGRD F YKILG VPRSAS IKD I fC 
KA YRKLALQLHPDRN PDD PQAQ E KFQDLGAAYEVLS DSEKRXQ Y 
DTYGEEGLKDGHQSSHGDIFSHFFGDFGFMFGGTPRQQDRNIPR 
3SDI I VDLEVTLE E VYAGNF VEWRNKPVARQAPG KRKCNCRQE 
^RTTQLGPGRFQMTQEVGCDECPWVKLVWEERTLEVEIEPGVRD 
3ME YP F IGEGEPHVDGEPGDLRFRI KWKHP I FERRGDDLYTNV 
riSI» VESIj VG FEMD I THLDGHKVHXSRDKI TRPOAKL WKKGEGL 
?^DN^IKGSLI ITFDVDFPKEQLTEEAREGIKQLLKQGSVQK 

ITRGMS KNCV PMEFEE YLiIiRM FQGTFYIiLQKI TKDNNAHTVKS R 








LEELDESYIEKFTDFLRLFVSVHLRRIESYSQFPV^EFLTttFK 
YTFHQPTHEGYFSCLDIWTLFLDYLTSKIKSRLGDKEAVLNRYE 
DALVLLLTEVLNRIQFRYNQAQLEELDDETLDDDQQTEWQRYLR 
QSLEWAKVMELLPTHAFSTLFPVLQDNIiEVYLGLCQFIVTSGS 
GHRIiNITAENDCRRLH CSI*RDLSSLLQAVGRLAEYF IGDVFAAR 
FNDALTWERLVKVTLYGSQIKLYNIETAVPSVLKPDLIDVHAQ 
SLAALQAYSHWLAQYCSEVHRQNTQQF VTIi I STTMDAI TPLIST 
KVQDKLLLSACHLLVSLATTVRPVFLIS I PAVQKVFNRI TDASA 
LRLVDKAQVLVCRALSNILLLPWPNLPENEQQMPVRSINHASLI 
SALSRDYRNLKPSAVAPQRKMPLDDTKL I IHQTLSVLEDIVEN I 
SGESTKSRQICYQSLQESVQVSIALFPAFIHQSDVTDEMLSFFL 
T L FRGLRVQMG VPFTE Q I IQTFLNM FTREQLAES ILHEG STGCR 
WEKFLKI LQWVQE PGQ VFKP FLPS I IAL.CMEQVYPI I AERPS 
PDVKAELFELljFRTLHHNWRYFFKSTVIiASVQRGIAEEQMENEP 
Q FS A I MQAFGQSFLQPDI HLFKQNLF YLETLNT KQKLYHKKI FR 
TAMLFQFVNVLIiQVLVHKSHDLLQEEIGIAIYNMASVDFDGFFA 
AFLPE FLTS CDGVDANQKS VI*GRNFKMDR VR RERGRAKR RAEWA 
RKPGTCAARRGHIEASGRGLCPPCSIAAAHEMPADLVL 


5557


1712 


491 


VIIjGAGIiRDKDMWI PVVGLPRRLRLSAIiAGAGRFCILGSEAATR " 

KHLPARNHCGLSDSSPQLWPBPDFRNPPRKASKASLDFKRYVTD 

RRLAETLAQIYLGKPSRPPHLLLECNPGPGILTQALLEAGAKW 

ALESDKTFI PHLESLGKNLDGKLRVIHCDFFKLDPRSGGVI KPP 

AMSSRGLFKNLGIEAVPWTADIPLKWGWFPSRGEKRALWKIAY 

DLYSCTSIYKFGRIEVNMFIGEKEFQKLMADPGNPDLYHVLSVI 

WQLACEIKVLHMEP WSS FDI YTRKG PliENPKRRELLDQLQQKLY 

LIQMXPRQNLFTKNLTPMNYNIFFHLLKHCFGRRSATVIDHLRS 

LTPIiDARDILMQIGKQEDSKWNMHPQDFKTLFETIERSKDCAY 

KWLYDETLEDR 


5558 


1509 


96 


ragcthpqvpadlgapaeprrpqktcvcllqpqpggOrg PTTMI 

TGVFSMRLWTPVGVLTSLAYCLHQRRVALAELQEADGQCPVDRS 
LLKLKMVQWFRHGARSPLKPLPLEEQVEWNPQLLEVPPQTQFD 
YT VTNIiAGG PKPYS P YDSQ YHETTLKGGMFAGQI, TKVGMQQMFA 
LGERLRKNYVEDIPFLSPTFNPQEVFIRSTNIFRNLESTRCIjLA 

glfqcqkegp iii htdeadsevlypnyqscws lrqrtrgrrqta 
si,qpgisedlkkvkdrmgidssdkvdffili^nvaaeqahni*ps 
cpmlkrfarmi eqravdts uy1 lpkedres lqmavgp flh i les 

NIjLKAMDSATAPDKIRKLYL yaahdvtfi PLLMTLGI FDHKWPP 

favdltmelyqhleskewfvqlyyhgkeqvprgcpdglcpldmf 
lnamsvytlspekyhalcsqtqvmevgnee 


5559 


150 


1983 


PLAATAHFAKMSRVAKYRRQVSEDPDI DSLLETLS PEEMEELEK 
ELDWDPDGSVPVGIiRQRNQTEKQSTGVYNREAMLl^FCEKETKK 
LMQREMSMDESKQVETKTDAKNGEERGRDASKXALGPRRDSDLG 
KEPKRGGLKKSFSRDRDEAGGKSGEKPKEEKIIRGIDKGRVRAA 
VDKKEAGKDGRGEERAVATKKEEEKKGSDRNTGLSRDKDKKREE 
MKEVAKKEDDEKVKGERRNTDTRKEGEKMKRAGGNTDMICKEDEK 
VKRGTGNTDTKKDDEKVKKNEPLHEKEAKDDSKTKTPEKQTPSG 
PTKPSEGPAKVEEEAAPS I FDEPLERVKNNDPEMTSVNVNNSDC 
ITNEILVRFTEAI^FNTVVKLFALANTRADDHVAFAIAIMLKAN 
KTITSLNLDSNHITGKGILAIFRALLQNNTLTELRFHNQRHICG 
GKTEME I AKLL KENTTLLKLG YHFELAGPRfTrVTNLLS RNMDKQ 
RQKRLQEQRQAQBAKGEKKDLLEVPKAGAVAKGSPKPSPQPSPK 
r-o *• xum a if wu^flfArtv w j* t» e fxJ\ P t*Xj l MEWiiKWSLiSPATQRKM 
GDKVLPAQEKNSRDQLIAArRSSNLKQLKKVEVPKLLQ 


5560 
5561 


9 

2175 


921 
1775 


SSVVEFSALSVSMACLSPSQLQKFQQDGFLVLEGFLSAEECVAM 
QQR IGE I VAEMDVPLHCRTEFSTQEEEQLRAQGSTDYFLSSGDK 
IRFFFEKGVFDEKGNFLVPPEKSINKIGHAIiHAHDPVFKSITHS 
FKVQTlARSIiGLQMPWVQSM YIFKQPHFGGE VS PHQDAS FL YT 
EPLGRVLGVWI AVEDATLENGCLWF I PGSHTSGVSRRMVRAPVG 
S APGTS FLGS EPARDNS L F VPTPVQRGALVLIHGEWHKSKQNL 
SDRSRQAYTFH1.MEASGTTWSPENWI,QPTAELPFPQLYT 
C YFI FQ FF SSi? Y PGIjHPHQT PAPLPNPGLYPPP VSMS PGQP PPQ 








QLLAPTYFSAPGVKNFGNPSYPYAPGALPPPPPPHLYPNTQAPS 
Q VYG G VT Y YNPAQQQ VQ P KPSPP RRT PQP VTI KP PP PE WSRGS 

S i 


5562 


342 


1385 


SSGKNDMAAAGAAGI,VRGIiKAGVLSQAOYLN^ 

LQSTD YGN FLANEAS PLTVSVIDDRtiKEKMWE FRHMRNHAYEP 

LASFLDPITYSYMIDNVILLITGTLHQRSIAELVPKCHPLGSFE 

QMEAVNIAQTPAELYNAILVDTPLAAFFQDCISBQDLDEMNIEI 

IRNTL YKAYLES FYKFCTLLGGTTADAMCP I LE FEADRRAFI IT 

INS FGTELS KEDRAKL FPKCGRL YPEG LAQLARADD Y EQVKNVA 

DYYPEYKLLFEGAGSNPGDKTLEDRFFEHEVKLNKIiAFIiKQFHF 

GVFYAFVKLKEQECRNIVWIAECIAQRHRAKIDmriPIF 


5563 


342 


1385 


S5 GKNDMAAAGAAGL VRGLfCAG VLS QADYhNLVQCETLBDL KLH 
LQSTDYGNFIiANFJVSPLTVSVIDDRI*KEKMVVE FRHMRNHAYEP 
LAS FLDFIT YS YMIDNVI LLI TGTLHQRS IABLVP KCHPLGS PE 
QMEAVNIAQTPAELYNAILVDTPLAAFFQDCISEQDLDEMNIEI 
I RNTLYKAYLES FYKFCTLLGGTTADAMCP ILEFEADRRAFI IT 
INSFGTEDSKEDRAKLFPHCGRLYPEGLAQLARADDYEQVKNVA 
DYYPEYKIJCiFEGAGSNPGDKTLEDRFFEHEVKLNKIiAFIiNQFHF 
GVFYAFVKLKEQECRNIVWIAECIAQRHRAKIDNYIPIP 


5564 


3 


914 


KVRRD KRA VWTARGRRR CGDSMSGG WMAQVGAWR TGALGLALLL 
LLGLGLGLEAAAS PLSTPTSAQAAGPSSGSCP PTKFQCRTSGIjC 
VPLTWRCDRDLDCSDGSDEEECRIBPCTQKGQCPPPPGLPCPCT 
GVSDCSGGTDKKIiRNCSRLACIAGBLRCTLSDDCIPLTWRCDGH 

pdcpdssdelgcgtneilpegdattmgppvtlesvtslrnattm 
gppvtlesvpsvgnatsssagdqsgsptaygviaaaavlsaslv 
tatlllds wlraqerlr plgllvamkeslllseqktslp 


5565 


993 


138 


RWNSPNPARAGSISRPQRAPGSVSAVAMTAAVFFGCAFIAFGPA 
LALYVFTIATEP3LRIIFL1AGAFFWLVS1jLISSI>VWFMARVI ID 
NKDGPTQKYLLI FGAFVS VYI QEMFRFAY YKLLKKASEGLKS IN 

pgetapsmrllayvsglgfg IMSGVFS f vntlsdslg pgtvg IH 

GDSPQFFLYSAFMTLVIILLHVFWGIVFFDGCEKKKWGILLIVL 
LTHLLVSAQTFISS YYGINLASAFI ILVLMGTWAFLAAGGSCRS 
LKLCLLCQDKNFIiLYNQRSR 


5566 


2043 


1232 


SHIQHHGRGAQAP VKMVS WM I SRAWLVFGMLYPAYYSYKAVKTj 
KNVKEYVRWW^YWIVFALYTVXETVADQTVAWFPLYYELKIAFV 
I WLLS P YTKGASL I YRKFLH PLL SS KERE I DDY I VQAKERGYE T 
MVNFGRQGLNLAATAAVTAAVKSQGAI TERLRSFSMHDLTTI QG 
DEPVGQRPYQPLPEAKKKSKPAPSESAGYGIPLKDGDEKTDEEA 

EGPYSDNEMLTHKGPRRSQSMKSVKTTKGRKEVRYGSLKYKVKK 
RPQVYF 1 


5567 


1554 


233 


E FLGS G VS PDI4ANEDGI1TALHQCCIDDFREM VQQIf LEAGANI NA 1 
CDSECWTPLHAAATaSHLHLVELilASGANLIiAVNTDGNMPYDL 
CDDEQTLDCLETAMADRGITQDSIEAARAVPELRMLDDIRSRLQ 
AGADIjHAPLDHGATIiLHVAAANGFSE AAALLLEHRAS LS AKD QD 
GWEPlJiAAAYWGQVPLVEIiVAHGADLNAKSLMDETPLDVCGDE 
EVRAKLIiELKHKHDALLRAQSRQRSLLRRRTSSAGSRGKWRRV 
SLTQRTDLYRKQHAQEAI VWQQPP PTS PEPPEDNDDRQTGAEIiR 
PP PPEEDNPEWRPHNGRVGGS PVRHLYS KRLDRS VS YQLSPLD 
S TT PHTLVHDKAHHTLADLKRQRAAAKLQRP P P EGPES PBTAE P 
GIrfQpfEVTPQPVCG FRAGGDP PliLKLTAPAVEAP VERRP CCLLM 


5568 
5569 


1731 
2 


587 

1 


835 


AEDRQPASRRGAGTTAA>IAA<;r?Pf3r , PQt<yr , T r , Di?uncB'rpEwt*T -r j 

SLLVSGPRLFLLQQ PLAPSGLTIj KS E ALRNWQ VYRI.VT Y I FVYE 
NPISLLCGAIIIWRFAGJJPERTVGTVRHCPFTVIFAIFSAIIFL I 
SFEAVSSLSKLGEVEDARGFTPVAPAMI.GVTTVRSRMRRALVFG 1 
MWPS VLVPWLLLGASWLIPQTS FLSNVCGLS IGIAYGLTYCYS 
IDLSERVAiKLDQTFPFSLMRRISVFKYVSGSSAERRAAQSRKIi 
^PVPGSYPTQSCHPHLSPSHPVSQTQHASGQKLASWPSCTPGHM 
PTLPPYQPASGLCYVQNHFGPNPTSSSVYPASAGTSIiGIQPPTP I 
/WSPGTVYSGALGTPGAAGSKESSRVPMP J 

3TPCPLAWERGSRSEDISVPGQKPPTCSSFSG^VGPSSLPHLG 








LKLLLLLLLLPLRGQANTGCYGIPGMPGLPGAPGKDGYDGLPGP 
KGEPGIPAIPGIRGPKGQKGEPGLPGHPGKNGPMGPPGMPGVPG 
PMGIPGEPGBEGRYKQKFQSVFTVTRQTHQPPAPNSLIRFNAVL 
TNPQGDYDTSTGKFTCKVPGLYYFVYHASHTANLCVLLYRSGVK 
WTFCGHTSKTNOVNSGGVLLRLQVGEEVWLAVNDYYDMVGIQG 
SDSVFSGFLLFPD 


5570 


264 


946 


RDRRDRGGVATSTEEPARPRAPQSRGPGPVSQTGRGRERGGGDT 
MSSPS PGKRRMDTD WKL 1 ESKHEVTI LGGLNEFWKFYGPQGT 
P YEGG VWKVRVDLPDK Y P F KS PS I G FMNKI FHPN I DEAS GT VCIi 
DVTNQTNTAL YDL TNI FES FL PQLLA YPNP I DPLNGD AAAM YLH 

RPEBYKQKIKBYIQKYATEBALKEQEEGTGDSSSESSMSDFSED 
EAQDMEL 


5571 


264 


946 


RDRRDRGG VATSTE EPAR PRAPQS RG PG PVSQTGRGRERGGGDT 
MSS PS PGKRRMDTD WKL I ESKHEVTI LGGLNE FWKFYGPQGT 
PYEGGVWKVRVDLPDKYPFKSPSIGFMNKIFHPNIDEASGTVCL 
DVINQTWTAI»YDI>TNIFESFLPQLLAYPNPIDPLNGDAAAMYIjH 
RPEBYKQKIKEYIQKYATEEALKEQEEGTGDSSSESSMSDFSED 
EAQDMEL 


5572 


2802 


2085 


RTDYRTGIPGRRFRVMAAGDGDVKLGTLGSGSESSNDGGSESPG 
DAGAAAEGGGWAAAALALLTGGGEMLLNVALVALVLU3AYRLWV 
RWGRRGXjGAGAGAGEESPATSLPRMKKRDFShEQhRQYDGSRtTP 
RILLAVNGKVFDVTKGSKFYGPAGPYGIFAGRDASRGLATFCLD 
KDALRDEYDDLSDLNAVQMESVREWEMQFKEXYDYVGRLLKPGE 
EPSEYTDEEDTKDH^KQD 


5573 


2*62 


219 


VPART PNAEDQGP EARAATAT PCQ S GG RE RAGEAAEDG VKMAAF 
S EMG VM P E IAQAVEEMDWLLPTD IQ AE S I PL I LGGGDVLMAAET 
GSGKTGAFSIPVIQIVYETLKDOX3EGKXGKTTIKTGASVLNKWQ 
MNP YDRGS AFA I GSDGLCCQ S RE VKE WHGCRATKGLMKGKH Y Y E 
VSCHDQGLCRVGWSTMQASLDLGTDKFGFGFGGTGKKSHNTKQFD 
NYGEEFTMHDT IGCYLDIDKGHVKFS KNGKDLGLAFEIPPHMKN 
QALFPACVLKNAELKFNFGEEEFKFPPKDGFVALS KAPDGY I VK 
S3HSGWAQVTQTKFLPNAPKALIVEPSRELAEQTLNNIK0FKKY 
IDNPKLRELLIIGGVAARDQLSVLENGVDIWGTPGRLDDLVST 
G KLNLS Q VRFL VLD EAJDGLLSQGYSDF I NRMHNQ IPQ VTS DG KR 
LQVI VCS ATLH S FDVKKLS E K IMHFPTWVDLKGE DS VPDTVHHV 
WPVNPKTDRLWERLGKSHIRTDDVHAKDNTRPGANSPEMWSEA 
I KILKGE YAVRAI KEHKMDQAIIFCRTKIDCDNLEQYFIQQGGG 
PDKKGHQFS CVCLHGDRKP HERKQNLER FKKGD VRFLI CTDVAA 
RGIDIHGVPYVINVTLPDEKQNYVHRIGRVGRAERMGLAISLVA 
T EKE KVW YHVCS S RGKGC YNTRLKEDGGCri WYNEMQLLS E I EE 
HLNCTISQVEPDIKVPVDEFDGKVTYGQKRAAGGGSYKGHVDIL 
APTVQELAALEKEAQTSFLHLGYLPNQLFRTF 


5574 


1731 


952 


NEGLEVFKEQELQP EDKGAV PEDAS TERS AMAS LGLQLVG Y I LG 
LLGLLGTLVAMLLPSWKTS S YVGAS IVTAVGFS KGLWMECATHS 
TGITQCDI YSTLLGLPADIQAAQAMMVTSSAISSLACI IS WGM 
RCTVFCQE SRAKDRVAVAGG VFFI LGGLLGF I P VAWNLHG I LRD 
FYSPLVPDSMKFEIGEALYLGIISSLFSLIAGIILCFSCSCQRN 
RSNYYDAYQAQ PLATRSS PRPGQPPKVKSEFNS YSLTGYV 


5575 


456 


766 


LLWALPC P PPTAAAVLLS S TGLMELLEKMLALTLAKADS PRTAL 
LCSAWLLTAS FS AQQH KGS LQKDPLLS QAC VGCLEALLDYLDAR 
S PDIGRNS PHYLMFP 


5576 


249 


2146 


RS WGAP W F WRMRLLRRRHMP LRLAMVG CAF VLFLFLLttRD VS S R 
E EATEKP WLKSL VS RKOHVLDLMLEAMNNLRDS MPKLQIRAP EA 
QQTLFS I NQSCL PGFYT PAE LKPFWER P PQDPNAPG ADGKAFQK 
S KWTPLETQEKEEG YKKHCFNAFASDR I SLQRS LGPDTRPPECV 
DQXFRR CP PLATTS VI IVFHNEAWSTLLRTVYS VLHTTPAI LLK 
EI ILVDDASTEEHLKEKLEQYVKQLQWRWRQBERKGLITARL 
LGASVAQAE VLTFLDAHCE C FHGWLE PLLARI ABDKTVWS P D I 
VTIDLNTFEFAKP VQRGRVHSRGNFDWS LTFGWBTLPPHE KQ RR 
KDETYP Z KSPTFAGGLFS I SKS YFEHI GTYDNQMEI WGGENVEM 
SEQ ID NO: 

Predicted beginning nucleotide location corresponding to first amino acid 
residue of amino acid sequence 

Predicted end nucleotide location corresponding to first amino acid 
residue of amino acid sequence 

Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








SFRVWQCGGQLEI I PCSWGHVFRTKSPHTFPKGTSVIARNQVR 
LAEVWMDSYFCKIFYRRNLQAAKMAQEKSFGDISERLQLREQLHC 
HNFS W YLHNVYPEM FV PDLTPTF YGAI KNLGTNQ CLDVGENNRG 
GKPLIMYSCHGLGGNQYFEYTTQRDLRHNIAKQLCLHVSKGALG 
LGSCHFTGKNSQVPKDEEWELAQDQLIRNSGSGTCIiTSQDKKPA 
MAPCNPSDPHQLWLFV 


5577 


3 


1275 


RNSDCS CGE I SVHCLPWVLF I LDLKVES SMFCPLKL I LLP VLLD 
YSLGLNDLKVSPPELTVHVGDSALMGCVFQSTEDKCI FKIDWTL 
S PGEHAKDE Y VL Y YYSNLS V P IGRFQNR VHLMGD I LCNDG S LLL 
QDVQEADQGTYZCBIRLKGESQVPKKAWLHVLPBEPKSlUmiV 
GG L I QMG C V FQSTE VKHVTKVEW I FSGRRAKEE I VFR Y YHKLRM 
S VE YSQS WGHFQNRVNLVGD I FRNDGS IMLQGVRESDGGNYTCS 
IHLGNLVFKKTIVLHVSPEEPRTLVTPAALRPLVLGGNQLVI IV 
G I VCAT I LLLPVL1 LI VKKTCGNKSSVNSTVLVKNTKKTNPE I K 
EKPCHFERCEGEKHIYSP1IVREVIEEEEPSEKSEATYMTMKPV 
WPSLRSDRKNSLEKKSGGGMPKTQQAF 


5578 


3 


783 


AVESMAS PGAGRAPPELPERNCG YREVEYWDQRYQGAADSAPYD 
WFGDFSSFRALLEPELRPEDRILVLGCGNSALSYELFLGGFPNV 
TS VDYSS VVVAAMQARYAHVPQLRWETMD VRKLDFP SAS FOWL 
E KG TLDALLAGERDPWTVSSEG VHTVD OVLSE VSR VLV PGGRF I 
SMTS AAPHFRT RHYAQAYYGW SLRHAT YGSG FHFHLYLMHKGGK 
LSVAQtiALGAQILSPPRPPTSPCFLQDSDHEDFLSAIQL 


5579 


3 


1540 


RN SGLARG AS ALARHGGGLAGG VGWDCGACAS RGQG VMEGLLTR 
CRALPALATCSRQLSGYVPCRFHHCAPRRGRRLLLSRVFQPQNL 
REDRVLS LODKSDDLTCKSQRLMLQVGLI YPASPGCYHLLP YTV 
RAMEKLVRVIDQEMQAIGGQKVNMPSLSPAELWQATNRWDLMGK 
ELLRLRDRHG KS YCLG PTHEEAI TAL I AS QKKLS YKQL P FLL YQ 
VTRKFRDBPRPRFGLLRGREF YMKDMYTFDSS PEAAQQTYS LVC 
DAYCSLFNKLGIiPFVKVQADVGTlGGTVSHEFQLPVDlGEDRLA 
X CPRCSFSANMETLDLSQMNCPACQGPLTKTKGIEVGHTF YLGT 
KYSS I FNAQFTNVCG KPTLAEKG C YGLGVTR I LAAAI E VLS TED 
CVRWPSLLAPYQACLI PPKKGSKEQAASEL IGQLYDHITEAVPQ 
LHGEVLLDDRTHLTIGNRLKDANKFGYPFVI IAGKRALEDPAHF 
EVWCQNTGBVAFLTKDGVMDLLTPVQTV 


5580 


1681 


450 


adagtrcipgfwpsgagysapaqrgrrssgrMraaaapgltapH 
wrllqccel eagelgmavpaaamgps algq sgpg smap wc5 v5 s 

GPSRYVIJSMQELFRGHSKTREFLAHSAKVHSVAWSCIJGRRLASG 
SFDKTASVFLLEKDRLVKENNYRGHGDSVDQLCWHPSNPDLFVT 
ASGDKTI R I WD VRTTKCIAT VNT KGENINI CWS PDGQT XAVGNK 
DDWTFI DAKTHRSKAEEQFKFEVNE I S WNNDNNMFFLTNGNGC 
INI hS YPELKP VQS INAHPSNCI C I KFDPMGKYFATGSADALVS 
LWDVDELVC VRCFSRLDWPVR TLS FS HD GKWLAS AS EDHF I DIA 
EVETGDKLWEVQCESPTFTVAWHPKRPLLAFACDDKDGKYDSSR 
EAGTVKLFGLPWDS 


5581 


54 


947 


GGGSGPRAPSATLLDTGESVAAVASGEDKGIAASAAAAAVFACS 
CSPDPQSSTMNPVYSPVQPGAPYGNPKNMAYTGYPTAYPAAAPA 
YNPSL YPTNS PS YAPEFQFLHSA YATLLMKQAWFQNSSS CGTEG 
TFHLPVDTGTENRTYQASSAAFRYTAGTPYKVPPTQSNTAPPPY 
SPSPN P YQTAMYPIRSA YPQQNL YAQGAYYTQ PVYAAQPH VIHH 
TTWQPNS I PS AI Y PAP VAAPRTNG VAMGMVAGTTMAMS AGTLL 
TTPQHTAIGAHPVSMPTYRADfiTPavcWDDMM 1 


5582 


5775 


2739 


I ITNNNNVI IPLVIAYHLSGSAQARGERSPAERLMERQKRKADI H 
EKGLQ F I QSTLPLKQ EE YEAFLLKLVQNLFAEGNDLFREKDYKQ 
ALVQ YMEGLNVADYAASDQ VAL PRELLCKLHVNRAAC Y FTMGL Y 
EKALEDSEKALGLDSESIRALFRKARALNELGRHKEAYECSSRC 
SLALPHDESVTQLGQELAQKLGLRVRKAYKRPQELETFSLLSNG 
TAAGVADQGTSNGLGSIDDIETDCYVDPRGSPALLPSTPTMPLF 
PHVLDLLAPLDSSRTLPSTDSLDDFSDGDVFGPBLDTLLDSLSL* 
VQGGLSGSGVPSELPQLIPVFPGGTPLLPPWGGSIPVSSPLPP 
ASPGLVMDPSKKLAASVLDALDPPGPTLDPLDLLPYSETRLDAL 








DSFGSTRGSLDKPDSFMEETNSQDHRJPPSGAQKPAPSPEPCMPtf 

TALLIKNPLAATHEFKQACQI>CYPKTGPRAGDYTYREGLEHKCK 

RDI LLGRLRSSEDQTWKR IRPR PTICTS FVGS YYLCKDMINKQDC 

KYGDNCTFAYHQEE ID VWTEB R KGTLNRDLIjFDPLGG VKRGS LT 

IAKLLKEHQGI FTFLCEI CF0SKPRIISKGTKDSPS VCSNLAAK 

HSFYNNKCLVHIVRSTSLKYSKIRQFQEHFQFDVCRHEVRYGCL 

REDSCHFAHSFlELKVWljbQQYSGMTHEDIVQESKKYWQQMEAH 

AGKASSSMGAPRTHGPSTFDLQWKFVCGQCMRNGCWEPDKDLK 

YCS AKARHCWT KERRVLLVMS KAKRKWVS VRPLPS I RUFPQQYD 

LCIHAQNGRKCQYVGNCSFAHSPEERDMWTFMKENKILDMQQTY 

DMWLKKHNPGKPGEGTPISSREGEKQIQMPTDYADIMMGYHCWL 

CGKNSNSKKQWQQHIQSEKHKEKVFT3DSDASGWAFRFPMGEFR 

LCDRI^KGKACPDGDKCRCAHGQEELNEWIfDRREVLKQKLAKAR 

KD MLLCPRDDD FGKYNFIiIjQEDGDLAGAT PE APAAAATATTGE 


5583 


3 


1265 


S S GCRQGRPGRSDRPRPPPRRH KM VKETRYYD1LGVK PS AS PEE 
I KKAYRKIiALKYHPDKNPDEGE kfkl isqaye vlsdpkkrdvyd 
QGGEQAI KEGGSGS PS FSS PMD I FDMFFGGGGRMARERRGKNW 
HQLS VTLEDIiYNGVTKKLALQKNVl C3KCEGVGGKKGSVE KCPL 
CKGRGMH1H1QQIGPGMVQQIQTVC1ECKGQGER1NPKDRCESC 
SGAKVIREKKI I E VH VEKGMKDGQKILFHGEGDQEPELEPGDVI 
IVLDQKDKSVFQRRGHDIjIMKMKIQLSEAI*CGFKKTIKTLDNRI 
I*V I TS KAGBV I KHGDLRC VRDEGM P I YKAPLE KG I L I IQFLVI F 
PEKHWLSLBKLPQLEALLPPRQKVRITDDMDQVELKEFCPNEQN 
WRQHREAYEEDEDGPQAGVQCQTA 


5584 


3 


1265 


SSGCRQGRPGRSDRPRPPPRRHKMVKETRYYDILGVKPSASPEE 
IKKAYRKI*ALKYHPDKNPDEGEKFKI»ISQAYEVLSDPKKRDVYD 
QGGEQAIKEGGSGS PSFSSPMD1 FDM FFt3GGGRMARERRGKNVV 
HQhS VThEDL YNG VTKKUAhQKNVl CEKCEGVGGXKGSVEKCPI* 
CKGRGMHIHIQQ I GPGMVQQI QTVC IECKGQGER I NPKDRCESC 
SGAJCV I R EKKI I E VHVEKGMXDGQK I L FHGEGDQ E PELE PGDVI 
IVLDQKDHSVFQRRGHDLIMKMKIQliSEALCGFKKTIKTLDNRI 
I*VI TS KAGEVI KHGDLRCVRDEGMP I YKAPLE KG I LIXQFLVI F 
PEKHWLSLEKLPQLEAIiLPPRQKVRI-TDDMDQVELKEFCPNEQN 
WRQHREAYEEDEDGPQAGVQCQTA 


5585 


2619 


915 


LPAGTPESSLHEALDQCMTAIiDtFLTNQFSEAIjSYIiKPRTKESM 
YHSLTYATILEMQAMMTFDPQDILLAGNMMKEAQMLCQRHRRKS 
S VTDS FSSIiVKR PTI/3QFTEEE IHAE VCYAKCLLQRAALTFLQD 
ENMVS F I KGGI KVRNS YQT YKELDSLVQS SQYCKG2NHPHFEGG 
VKLGVGAF^7I/TIfSMLPTRILRLLEFVGFSGNKDYGLLQLEEGAS 
GHSFRS VLCVMLLLCYHTFLTFVIX3TGNVNI EEAEKLDKP YLNR 
YPKGAI FZjFLAGRI E VI KGNI DAA I RR FEECCEAQQHWKQFHHM 
CYWELMWCFTYKGQWKMSYFYADLLSKENCWSKATYIYMKAAYL 
SMFGKEDHKPFGDDEVELFRAVPGLKLKIAGKSLPTEKFAIRKS 
RRYFSSNPISLPVPALEMMYIWNGYAVIGKQPKIiTDGILEIITX 
AEEKLEKGPENEYSVDDECLVKLLKGLCLKYX.GRVQEAEENFRS 
I S ANEKK I KYDH YL I PNAL LEIiAI*LLMEQDRN£EAI KLLESAKQ 
NYKNYSMESRTHFRIQAATLQAKSSLBNSSRSMVSSVSL 


5586 
5587 


2619 
1768 


915 
148 


LPAGTPESSIiHEALDQCMTALDLFLTNQFSEALSYLKPRTKESM 
YHSLTYATILEMQAMMTFDPQDlLLiAGN[4MKEAQMLCQRHRRKS 
SVTDSFSSLVNRPTLGQFTEEErHAEVCYAKCLLQRAALTFLQD 
ENMVS FI KGG I KVRNS YQTYKELDS LVQS SQYCKGENHPHFEGG 

GHS FRS VLCVMLLLC YHTFLTFVLGTGNVN I EEAE KLLKP YLNR 
YPKGAI FLFLAGR I E VI KGNIDAAIRRFEECCEAQQHWKQFHHM 
CYWELMWCFTYKGQWKMS YFYADLLSKENCWSKATY I YMKAAYL 
SMFGKEDHKPFGDDEVELFRAVPGLKLKIAGKSLPTEKFAIRKS 
RR YFSSNP ISLPVPAiEMMYI WNG YAVIGKQPKLTDGILEI I TK 
AEEMLEKGPENEYSVDDECLVKLLKGLCLKYLGRVQEAEENFRS 
ISANEKKIKYDHYlilPNAHiELALLLMBQDRNEEAIKLLESAKQ 
NYKNYSMESRTHFR1QAATLQAKSSLENSSRSMVSSVSL 
SSAVPDGAVtixyvAVAVGGPPHSCRCRPCCLMAAIGVHLGCTSA 








cvavykdgragwandagdrvtpawayseneeivglaakqS^i - 

RNISNTVMKVKQILGRSSSDPQAQKYIAESKCLVIBKNGKLRYE 
IDTGEETKFVNPEDVARIiIFSKMKETAHSVLGSDANDWITVPP 
DFGEKQKNAIX3EAARAAGFNVLRI/I HEPSAALLAYG IGQDS PTG 
KSN I LVFKLGGTS LSLSVMEVNSG I YRVLSTNTDDNIGGAHFTE 
TLAQ YLASEFQRS FKHDVRGNARAMMKLTNSAEVAKHSLSTLGS 
ANCFLDSLYEGQDFDCNVSRARFEIjLCSPLFNKCIEAIRGLLDQ 

ngftaddinkwl cggssr i pklqqli kdlfpavellns i p pde 
vipigaaieagiligkenllvedslmiecsardilvkgvdesga 
srft vli f psgt p l p arrqhtlqapg s i s s vclel y es dgxns ak 
eetkfaqwxqdldkkengiirdilavltmkrdgslhvtctdqet 
gkceais1eias 


5588 


3 


589 


TPPP PEQAMVAATVAAAWLLLWAAACAQQEQDF YDFKAVN I RGK 
L VS LEK YRGS VS L WNVAS ECG FTDQHYRALQQ LQRDLG PHHFN 
VLAFPCNQFGQQE PDSNKE I ES FARRTYS VSFPMFSKI AVTGTG 
AHPAFKYLAQTSGKEPTWNFWKYLVAPDGKWGAMDPTVSVEEV 
RPQITALVRKLI LliKREDL 


5589 


1884 


553 


LRQAWHEGGIGQTDKERGAAALiPGEEGDPTRGRSLGRASWESGS 
PRRPRS P FS SFLPRP I CLSLEARPCS 1 EDRRNWSIjIGRPGAPAS 
GbNRSSGLWLGPDRCRPRSRCSCRVMENPSPAAAliGKALCAIiLL 
ATLGAAGQPLGGE S I CSARAPAKYS ITFTGKWSQTAFPKQYPL F 
RPPAQWSSIiLGAAHSSDYSMWRKNQYVSWGLRDFAERGEAWALM 
KEIEAAGEALQSVHAVFSAPAVPSGTGQTSAELEVQRRHSLVSF 
WRXVPSPDMFVGVDSLDLCDGDRWREQAAIiDLYPYDAGTDSGF 
TFSSPNFATIPQDTVTEITSSSPSHPANSFYYPRLKAIjPPIARV 
TLLRLRQSPRAFIPPAPVLPSRDNEIVDSASVPETPLDCEVSLW 
SSWGLCGGHCGRLGTKSRTRYVRVQPANNGSPCPELEEEAECVP 
DNCV 


5590 


72 


896 


LCSSGALRLLPAMVAWRSAFLVCLAiPS LATLVQR^SGDFDD FNL 
EDAVKETS S VKQPWDHTTTTTTNRPGTTRAPAKP PGSGLDLADA 
LDDQDDGRRKPGIGGRERWNHVTTTTKRPVTTRAPANTLGNDFD 
LADALDDRNDRCDGRRKPIAGGGGFSDKDLEDIVGGGEYKPDKG 
KGDGRYGSND DPG SGMVAE PGT I AG VASALAMAIj I GAVS S Y I S Y 
QQKKFCFSIQQGLNADYVKGENIiEAWCEEPQVKYSTLHTQSAE 
PPPPPEPARI 


5591 


68 


1494 


AGSSRKAAAERLLVSAGCRSIAGRASGVLLLPAELliPGEEEAMA 
LRVTRNSKINAENKAKINMAGAKRVPTAPAATSKPGLRPRTALG 
DIGNKVSEQLQAKMPMKKEAKPSATGKVIDKKLPKPLEKVPMLV 
PVPVSEPVPEPEPEPEPEPVKEEKLSPEPILVDTASPSPMETSG 
CAPAEEDL CQAFSDVIZAVNDVDAEDOADPNLC3EYVKDIYAYh 
RQLEEEQAVRPKYLIiGREVTGNMRAILlDWLVQVQMKFRLLQET 
MYMTVSI IDRFMQNNCVPKKMLQIjVGVTAMFIASKYEEMYPPE I 
GDFAFVTDNT YTKHQ I RQMEMKI LRALNFGLGRPLPLHFLRRAS 
KIGEVDVEQHTLAKYLMELTMLDYDMVHFPPSQIAAGAFCLALK 
I LDNGE WTPTLQH YLS YTEESLL PVMQHLAKNAAMWQGLTKHM 
TVKNKYATS KHAK I STLPQLNS ALVQDIiAKAVAKV 


5592 


242 


924 


YGES KDWNQ KD LLS ALVL.TTVNCLPTP I MAKSAEVKLAI FGRAG 
VGKSALWRFLTKRFI WE YDPTLES rYRHQATI DDE WSMEILD 
TAGQEDTIQREGHMRWGEGFVLVYDITDRGSFEEVLPLKNILDE 
I KKP KNVTL I L VGNKADLDHS RQ VS TEEGEKLATELACAFYECS 

i\*e>\m\ j> " 1 cuv-rct, V rLKrCKri v^Ljjxi KKKbb 1 i nVKyAlMK 

MLTKISS 


5593 


3 


1113 


HASGGRAANMAAERGAGQQQSQEMMEVDRRVESEESGDEEGKKH 

SSGIVADLSEQSLKDGEERGEEDPEEEHELPVDMETINLDRDAE 

DVDLNHYRIGKlEGFEVLKlCVKTLCIiRQm,lKCIENLEBLQSLR 

BLDIiYDNQIJCKIENLEAiTELEILDISFNLLRNIEGVDJCIiTRLK 

KLFLVNNKISKIENLSNLHQLQMLELGSNRIRAIENIDTLTNLB 

SLFLGKNKITKLQNLDALTNLTVLSMQSimi>rKrEGLQWr,VNLR 

ELYLS HNG I EVIEGLENNNKLTMLDI ASNRI KKIENI SHLTELQ 

EFWMNDNIiLESWSDLDEIiKGARSLETVYLERNPIiQKDPQYRRKV 








MLAL PSVRQ I DATFVR F 


5594 


3 


1113 


HASGGRAANMAAERGAGQQQSQEMMEVDRRVESEESGDEEGKKH 
SSGIVADLSEQSLKDGEERGEEDPEESHELPVDMETINLDRDAE 
DVDLNHYRIG KI EGFE VLKKVKTIiCIiRQNLI KC I ENLEELQSLR 
ELDLYDNQIKKlENLEALTELEILDISFNLLRNIEGVDKLTRLiK 
KLFLVNNKISKIENLSWLHQLQMLELGSNRIRAISNIDTLTNLE 
SLFLGKNKITKLQNLDALTNLTVLSMQSNRLTKI EGLQNLVNLR 
ELYLSHNG I EV I EGLENNNKLTMLD I ASNRI KKI 3NISHLTELQ 
EFWMNDNLLESWSDLDELKGARSLETVYLBRNPLQKDPQYRRKV 
MLALPSVRQIDATFVRP 


5595 


3 


1476 


ARWNGRW VQ V PAW PG PG CGTNASGERQRQL PRAWR PVGRTLGSE 
PIAIAWSPPLYLFPIPLPSWAVSQPTPTLGTMFADLDYDIEEDK 
LG I PT VPGKVTLQKDAQNL I GI S I GGGAQ YCPCLYI VQ VFDNTP 
AALDG TVAAGDE I TGVNGRS I KGKTKVE VAKM I QE VKGE VT I KY 
NKLQADP KQGMSLD I VLKKVKHRLVENMSSGTADALGLSRAILC 
NDGLVKRLEELERTAELYKGOTEHTKNLLRAFYELSQTHRAFGD 
VFSVIGVREPQPAASEAFVKFADAHRSIEKFGIRLLKTIKPMLT 
DLNTYLNKAI PDTRLTI KKYLDVKFEYTjSYCLKVKEMDDEEYSC 
IALGEPLYRVSTGNYEYRLrEjRCRQEARARFSQMRKDVLEKMSL 
LDQKHVQDI VFOLQRLVSTMSKYYNDCYAVLRDADVFP I EVDLA 
HTTLAYGUS1QEEFTDGEEEEEEEDTAAGBPSRDTRGAAGPLDKG 
GSWCDS 


5596 


698 


219 


GAVLAPSSLPAAELAAQGESQSLBDLSNTSRPTSEVYKISFIFP 
NGDKYDGDCTR'tS SGIYERNGIG IHTTPNG I VYTGSWKDDKMNG 
FGRL EH FSGAVYEGQF KDNM FHGLGT YTFPNGAKYTGNFNENRV 
KGEGEYTHIQGTRMDWTFHFTSCSQT 


5597 


3 


731 


I SCKMAADGQS SLPASWRSVTLTHVE YPAGDLSGHLLAYI>SI*S P 
VPVIVGFVTHIFKRELHTISFLGGIALNEGVNWLIKNVIQEPR 
PCGGPHTAVGTKYGMPSSHSQFMWFFSVYSFLFLYIjRMHQTNNA 
RFLDLLWRHVLSLGLLAVAFLVS YS RVYLLYHTWS QVLYGG I AG 
GLMAIAWFIFTQEVLTPLFPRIAAWPVSEFFLIRDTSLIPNVLW 
FEYTVTRAEARNRQRKLGTKLQ 


5598 


326 


2440 


GIGP IAAS FIFCKVASJLYIFIiS PPPPS VSGVPYS PANSS WS CAI* 
VPLLGSGVPPHPPAPS PCCSGQTMLKMIjS FKLLLLAVALGFFEG 
DAKFGERNEGSGARRRRCLNGNPPKRLKRRDRRMMSQLELLSGG 
EMLCGGFYPRLSCCLRSDS PGLGRLENKI FSVTNNTECGKLLEE 
I KCALCS PHSQSLFHS PEREVLERDLVLPLLCKD YCKEFFYTCR 
GRI PG FLQTTADE FCFYYAR KDGGLCFPDF PRKQVRGPASNYLD 
QMEEYDKVEE1SRKHKHNCFCIQEWSGLRQPVGALHSGDGSQR 
IiFIIjEKEGYVXILTPEGEIFKEPYIiDIHlCLVQSGIKGGDERGLIi 
SLAFHPNYKKWGKLYVSYTTNQERWAIGPHDHILRVVEYTVSRK 
NPHQVDLRTARVFLEVAELHRKHLGGQLIiFGPDGFLYI I LGDGM 
ITLDDMEEMDGIiSDFTGS VLRLDVDTDMCNVPYS I PRSNPHFNS 
TNQPPEVFAHGLHDPGRCAVDRHPTDININLTILCSDSNGKNRS 
SARILQI1KGKDYESEPSLLEFKPFSNGPLVGGFVYRGCQSERL 
YGS YVFGDRNGNFLTLQQS pvtkq wqbkplclgtsgscrg YFSG 
h i lgfgedelgbvyi lsss ksmtqthngkl yki vdpkrplmpee 
cratvqpaqtltsecsrlcrngyctptgkcccspgwegdfcrtg 


5599 


326 


2440 


GIGPIAASFIFCKVASLYIFLSPPPPSVSGVPYSPANSSWSCAL 
VPLLGSGV P PHP PAPS PCCSGQTMLKM LS FKLLLLAVALGFFEG 
DAKFGE RNEGSGARRRRCLNGN P P KRLKRRDRRMMS QLELLSGG 
EMLCGGFYPRI^CCLRSDSPGLGRDEWKIFSVTNNTECGKLLEE 
I XCALCSPHSQSLFHS PERE VLERDLVLPLLCKDYCKEFF YTCR 
GHIP3FLQTTADEFCFYYARKDGGLCFPDFPRKQVRGPASNYLD 
QMEE YDKVEE ISRKHKHNCFCI QE VVSGLRQPVGALHSGDGSQR 
LFILEKEGYVKILTPEGEIFKEPYLDIHKLVQSGIKGGDERGLL 
S LAFHPNYKKNGKLYVS YTTNQERWAIGPHDHILRWE YTVSRK 
NPHQVDLRTARVFLEVAELHRKHLGGQLLFGPDGFLYIILGDGM 
ITLDDPIEEMDGLSDFTCSVLRIjDVDTDMCNVPYSIPRSNPHFNS 
TNQPPEVFAHGLHDPGRCAVDRHPTDININLTILCSDSNGKNRS 








SARILQIIKGKDYESEPSLLEFKPFSNGPLVGGFVYRGCQSBRL 
YGSYVFGDRNGNFLTLQQSPVTKQWQEKPLCLGTSGSCRGYFSG 
HILGFGEDEIX3EVYILSSSKSMTQTHNGKLYKIVDPKRPLMPEE 
CRATVQPAQTLTSECSRLCRNGYCTPTGKCCCSPGWEGDFCRTG 


5600 


1977 


1244 


S LR VLSGHLMQTRDL VQPD KPAS PKF I VTLDG VP S P PG YMS DQE 
EDMCFEGMKPVNQTAASWKGLRGLLHPQQLHLLSRQLEDPNGSF 
SNAEMSELSVAQKPEKLLERCKYWPACKNGDECAYHHPISPCKA 
FPNCKFAEKCLFVHPNCKYDAKCTKPDCPFTHVSRRIPVLSPKP 
AVAPPAPPSSSQLCRYFPACKKMECPFYHPKHCRFNTQCTRPDC 
TFYHPTINVPPRHALKWIRPQTSE 


5601 


1977 


1244 


SLR VLSGHLMQTRDLVQPD KPAS PKF I VTLDG VPS PPG YMS DQE 
EDMCFEGMKPVNQTAASNKGLRGLLHPQQLHLLSRQLEDPNGSF 
SNAEMSELSVAQKPEKLLERCKYHPACKNGDECAYHHPISPCKA 
FPNCKFAEKCLFVHPNCKYDAKCTKPDCPFTHVSRRIPVLSPKP 
AVAPPAPPSSSQLCRYFPACKKMECPFYHPKHCRFNTQCTRPDC 
TFYHPTINVPPRHALKWIRPQTSE 


5602 


246 


766 


YHTSCTVWRTAKEALENTEVPVGCLMVYNNEWGKGRNEVNQTK 
NATRHAEMVA IDQVLDWCRQSGKS PSE VFEHTVLYVTVEPCI MC 
AAALRLM K I PL WYGCQN E R FGG CGS VI^NI AS ADL PNTGR P FQC 
I PG YRAE EAVEML KTFYKQENPNAPKS KVRKXE CQQ I LNMF 


5603 


1 


565 


FRGRT PI SGGERGCAQYP I PATPARSGENRTM PGAGDGGKAPAR 
WU3TGLLGLFLLP VTLSLEVS VGKATD I YAVNGTE I LLPCTFSS 
CFGFBDLHFRWTYNSSDAFK I LI EGTVKNEKSDP KVTLKDDDR I 
TLVGSTKEKRNN 1 S I VLRDLE FSDTGKYTCHVKNPKENNLQHHA 
T I FLQ WDRRMQ 


5604 


1 


1506 


EDIFPAQLLKLQRHERVWQQEPPVRDHRSWGGSGAGGVAGREWT 
DQGQVALGGH YMAEGEG Y FAMS E DELAC S P YI P LGG D FGGGD FG 
GGDFGGGDFGGGDFGGGGSFGGHCLDYCESPTAHCNVLNWEQVQ 
RLDGILSETIPIHGRGNFPTLELQPSLIVKWRRRLAEKRIGVR 
DVRLNGSAASHVLHQDSGLGYKDLDLIFCADLRGEGEFQTVKDV 
VLDCLLDFLPEGVNKEKITPLTLKEAYVQKMVKVCNDSDRWSLI 
3 LS NNSG KNVEL KF VDS LRRQFE FSVDS FQ I KLOS LLL F YECS E 
NPMTET FHP T I IG ES VYGDFQEAFDHLCWKI I ATRN PEE I RGGG 
LLKYCNLLVRGFR PAS DE I KTLQRYMCSRFF I DFSD IGEQQRKL 
E S YLQNH FVG LEDRX YE YLMTLHG WNE S T VC LMGHE RRQTLNL 
ITMLAIRVLADQNVI PNVANVTC YYQ PAP YVADANFSN YYIAQV 
QPVFTCQQQTYSTWLPCN 


5605 


35 


1821 


SQRSCPRS PS SPAP P WARCSNPDS RTGGVPVPRAWSAGGPALGL 
MAAPVRLGRKRPLPACPNPLFVRWLTEWRDEATRSRHRTRFVFQ 
KALRSLRRYPLPLRSGKEAKILQHFGDGLCRMLDERLQRHRTSG 
GDHAPDS PSGENSPAPQGRLAEVQDSSMP VPAQPKAGGS GS YWP 
ARHSGARVI LLVLYREHLNPNGHHFLTKEELLQRCAQKS PRVAP 
GSARPWPALRSLLHRNLVLRTHQPARYSLTPEGLELAQKLAESE 
GLSLLNVGIGPKEPPGEETAVPGAASAELASEAGVQQQPLELRP 
GEYRVLLCVDIGETRGGGHRPELLRELQRLHVTHTVRKLHVGDF 
WWAQETNPRDPANPGELVLDHIVERKRLDDLCSSIIDGRFREQ 
KFRLXRCGLERRVYLVEEHGSVHNLSLPESTLLQAVTNTQVIDG 
FFVKRTAD I KES AAYLALLTRGLQRL YQGHTLRSRPWGTPGNPE 
SGAMT3PNPLCSLLTFS DFNAGAI KNKAQS VREVFARQLMQVRG 
VSGEKAAALVDRYSTPAS LLAAYDACATPKEQETLLSTI KCGRL 
QRNLGPALSRTLSQLYCS YGPLT 


5606 


3 


1099 


GRS RC PGPGARGGTMS P RS CLR S LRLL VFAVFS AAASNWLYLAK 
LSSVGSISEBETCEKLKG.LIQRQVQMCKRNLEVMDSVRRGAQLA 
IEECQYQFRNRRWNCSTLDSLPVFGKWTQGTREAAFVYAISSA 
GVAFAVTRACSSGELEKCGCDRTVHGVS PQGFQWSGCS DNIAYG 
VAFSQSFVDVRERSKGASSSRALMNLHNNEAGRKAILTHMRVEC 
KCHGVSGS CE VKTCWRAVPPFRQVGHALKEKFDGATE VE PRR VG 
SSRALVPRNAQFKPHTDEDLVYLEPSPDFCEQDMRSGVLGTRGR 
TCNKTS KAIDG CELLCCGRGFHTAQVELAERCS CKFHW CC FVKC 
RQCQRLVELHTCR 


5607 


521 


141 


PPVCNPAEAMPSPGTVCSIiLIiLC^MT.WLDLAMAGSSFLSPEHQRV 
QQRKES K KP PAKLQPRALAGWLRPEDGGQAJ£(3AEDELE VRFNAP 
FD VG I KLS G VQ YQQHSQALG KFLQD I LW EEAKEAPAD K 


5608 


2 


983 


W FQS PLRQADPG PPRHTL FMD F VAGAIGG VCGDAVG Y P L DTVKV 
RIQTEPKYTGIWHCVRDTYHRERVWGFYRGLLLPVCTVSLVSSE 
VFG T YRHCLAH I CRLR FGNPDAKP TKADI TIjSGCAS GL VR VFLT 
S P TE VAKVRLQ TQTQAQKQQRRI>S ASGP1AVPPMCP VP PACP B P 
KYRG P LHCLATVAREEGLCGI/YKG S SALVLR DGH S FATYFLS YA 
VLCE WLS PAGHS R PD VPG VL VAGGCAGVLAWAVATPMDVI KSRL 
QADGQGQRRYRGLLHCMVTIVREEGPRVLFKGLVLNCCRAFPVN 
MWFVAYEAVLRLARGIjLT 


5609 


1628 


304 


AKGVWVLPSPPPRPGRGALVSGSGLRRGRSGTSWRPRRMNHKSK 
KR IREAKRSARPELKDSLD WTRHNY YESFSLS paavadnverad 
ALQLSVEEFVERYERPYKPWLLNAQEGWSAQEKWTLERLKRKY 
RNQKFKCGEDNDGYSVKMKMKYYIEYMESTRDDSPLYIFDSSYG 
EHPKRRKLLEDYKVPKFFTDDLFQYAGEKRRPPYRWFVMGPPRS 
GTGI H I DPLGTSAWNALVQGHKR WCLFPTSTPRELIKVTRDEGG 
NQQDEAI TWFNV I YPRTQLPTW P PEF KPLE I LQKPGETVFVPGG 
WWHWrjNLDTTIAlTQNFASSTNFPVVWHKTVRGRPKLSRKWYR 
ILKQEHPELAVIiADSVDIiOESTGIASDSSSDSSSSSSSSSSDSD 

SECESGSEGDGTVHRRKKRRTCSMVGNGDTTSQDDCVSKERSSS 
R 


5610 


54 


1196 


LERTPASADMAW TKYQL FlaAGLMLVTG^ £ NTI£ AKWADN FMAEG 
CGG S KEHS FQH P FLQAVGMFLGB FSCLAAF YLLRCRAAGQS DSS 
VDPQQPFNPLLFLPPALCDMTGTSLMYVALNMTSASSFQMLRGA 
VI I FTGLFS VAFLGRRLVLSQWLGI LATI AGLVVVGLADLLS KH 
DSQHKLSEVITGDLLI IMAQI I VAIQMVLEEKFVYKHNVHPLRA 
VGTEGLFGFVI LSIiLIiVPM YYI PAGS FSGNPRGTliEDALDAFCQ 
VGQQPL I AVALLGNI SS I AFFNFAG I S VTKELS ATTRMVLDSLR 
TWIWAIiSI^GWEAFHALQILGFLILLIGTALYNGLHRPLLGR 
LSRGRPLAEES EQERLLGGTRTPINDAS 


5611 


2 


577 


FVL PNRLGI PGS TFRGPGACASSSSLAASAKPGAGGS PALAMSG 
ELSNRFQGGKAFGliliKARQERRlAEINREFLCDQKYSDEENLPE 
KLTAFKE KYME FDLNNEGE IDLMS LKRMMEKLGVPKTHLEMKKM 
ISEVTGGVSDTISYRDFVNMMLGKRSAVTiKLVMMFEGKANESSP 
KPVGPPPERDIASLP 


5612 


1 


721 


ASRDGYMDATIAPHRIPPKMPOYGEENHIFELKQAMWLCKHLNS 
S LLTLENLI LNE FS YTATEARRLYLQRKT VPSALLVQL I QERLA 
EEDCI KQGWILDGI PETREQALRIQTLG ITPRHVI VLSAPDTVL 
IERNLGKRIDPQTGEIYHTTFDVTPPESEIQI^RLMVPEDISEIjET 
AQKLLEYHRNIVRVIPSYPKILKVISADQPCVDVFYQALTYVQS 
NHRTNAPFTPRVIjLLGPVGS 


5613 


115 


1279 


RGVDPALRRAEKMIiPLSIKDDEYKPPKFNLFGKISGWFRSILSD 

ktsrklffflclnlsfafvellygiwsnclglisdsfhmffdst 
ai laglaas viskpjrdndafs yg yvrae vlagfvnglfli ftaf 

F I FSEGVERALAP PDVHHERLLLVSILGFWNLIG 1 FVFKHGGH 
GHSHGSGHGHSHSLFNGATiDQAHGHVDHCHSHEVKHGAAHSHDH 
AHGHGHFHS HDG PSLKE TTGPS RQI LQG VFLHI LADTLGS IGVI 
AS AIKMQNFGLM 1 ADP I CS I L I AI L IWSVI PLLRE SVG I LMQR 
TP P Uh BNS LPQC YQR VQQLQG V YS LQEQHF WTLCSD VYVGTLKL 
I VAPDADARWILSQTHNI FTQAGVRQL YVQ IDFAAM 


5614 


3 


1268 



LLSRNEHACPUaAGLGLTQRKPKAIRGREGRATNQGQGETQNER 
APWGARQRLG VMAELQQLQEFE I PTGREALRGNHSAIiLRVAD YC 
EDNYVQATDKRKAIiEE TMA FTTQALAS VAYQ VGNLAGHTLRMLD 
LQGAALRQVE ARVS TLGQMVNMHMEKVARRE IGT1ATVQRL P PG 
QKVI A PENL P PLTP YCRRPLNFGCLDD I GHG I KDLSTQLSRTGT 
LS RKS I KAP AT PAS ATLGRP P R I PE P VHLP WPDGRI,S AAS S AS 
SLASAGSAEGVGGAPTPKGQAAPPAPPLPSSLDPPPPPAAVEVF 
2RPPTLEELSPPPPDEELPLPI»DLPPPPPLDGDEIjGLPPPPPGF 
SPDEPSWVPASYI^KVVTLYPYTSQKI^ELSFSEGTVICVTRRY 
SDGWCEGVSS EGTGFFPGNYVEPSC 


5615 


9 


1558 


ALGRRRPGDPREMBAAATPAAAGAARREELDMDVMRPLINEQNF 
DGTSDEEHEQELLPVQKHYQLDDQEGI S FVQTLMHI*LKGNIGTG 
LLGLPLA I KNAG I VU3 PISLVFIGI IS VHCMHILVRCSHFLCLR 
FKKSTLGY S DTVS FAME VS PWS CLQKQAAWGRS WDFFLVITQL 
GFCSVYIVFLAENVKQVHEGFLESKVFISNSTNSSNPCERRSVD 
LR I YMLC FL P FI I LIj VF I RELKNL FVLS FIiANVSMAVS L VT I YQ 
YWRNKPDPHNLPIVAGWKKYPLFFGTAVFAFEGIGWLPLENQ 
MKESKRFPQALNIGMGIVTTLYVTLATLGYMCFHDEIKGSITLN 
LPQDVWLYQS VKIL YS FGI FVTYS IQF YVPAE II I PG I TSKFHT 
KMKQICEFGIRSFLVS I TCAGAILI PRLDI VI SFVGAVSSSTLA 
L I L ? P LVE I L»T FSKEH YNI WMVLKN I S I AFTG WG FLLGT Y ITV 
EE 1 1 YPTP KWAGTPQS PFLNLNSTCLTSGLK 


5616


1 


719 


DDFVRCGPQSAAMGASARIiLRAVIMGAPGSGKGTVSSRITTHFE 
LKHLS SGDIjLRDNMLRGTEIGVIAKAFI DQGKLI PDDVMTRIjAL 
HELKNLTQYSWLIiDGFPRTLPQAEALDRAYQIDTVINLNVPFEV 
IKQRLTARWIHPASGRVYNIEFNPPKTVGIDDLTGEPLIQREDD 
KPETVIKRLKAYEDQTXPVLEYYQKKGVLETFSGTETNKIWPYV 

yaflqtkvpqrsqkasvtp 


5617 


176


765 


PWRGRGSRPRGAGAMAEEQVNRSAGLAPDCEASATAE TTVSSVG 
TCEAAGKSPEPKDYDSTCVFCRIAGRQDPGTELLHCENEDIjICF 
KDI K PAATHH YL WPKKH 1 GNCRTLRKDQVELVENMVT VG KT I L 
ERNNFTDFTNVRMGFHWPP FCS I SHLHLHVLAPVDQLGFLS KLV 
YRVNSYWFITADHLIBKIiRT | 


5618 



3 


1692 


YLNYINLKSENKLSGKEDLWEKLQYLWKSTLNLPEDLLRVPDES 
LFLNSGGDSLKSIRLLSBIEKLVGTSVPGLLBIILSSSILEIYN 
HILQTWPDEDVTFRKSCATKRKI.SNINQEEASGTSLHQKAI.MT 
FTCHNE INAF WliSRGS Q I LSIiNS TRFLTKLGH CS S AC PSDS VS 
QTNIQNLKGLNSPVLIGKSKDPSCVAKVSEEGKPAIGTQKMELH 
VRWRSDTGKCVDASPLWIPTFDKSSTTVYIGSHSHRMKAVDFY 
SGKVK WEQ I IiGDR I ES S ACVS KCGNF I VVGC YNGL VYVLKS NSG 
EK YWM FTTEDAVKSS ATMDPTTGIj I Y I GS HDQHAYALDI YRKKC 
VWKSKCGGTVFSSPCXNLIPHHLYFATLGGLLLAVNPATGNVIW 
KHSCGKPLFSS PQCCSQYI CIGCVDGNUjCFTHFGEQVWQFSTS 
GPIFSSPCTSPSEQKIFFGSHDCFIYCCNMKGHLQWKFETTSRV 
YATPFAFHNYNGSNEMLLAAASTDGKVWILESQSGQLQSVYELP 
GEVFSSPWLESMLIIGCRDNYVYCLDLLGGNQK 


5619 

— S^26 


2160 


1477 


DS P VLPTSGNV I STAQPAQ P WS AVEAAliRSliGS P PGAGRG CPCP 
AQSLHSHQLAAWDPLKPSLRSYPPHLLQHPQLRSLTASSGHLGR 
RSCPQ PR P LE ELLRAGS STR PQPLTSS CCGMS CM YS FLGHCS VL 
LWGTKGRGSGSPSSPGCCLHPPAQHSQDLPLVHVDVGWQFPLGP 

TVGLRPGLU3ERQRGALRAGDPQCQCPLPATVREDLGVPSPWAA 
ECSPPATP 




930 


182 


PLPP PTLAMFLTRSE YDRGVNTFSPEGRLFQVE YAI EAI KLGST • 
AIG IQTS EG VCI*AVEKR ITS PLMEPS S I EKI VE I DAH IGCAMSG 
IjIADAKTLIDKARVETQNHWFTYNETMTVESVTQAVSNIiALQFG 
EEDADPGAMSRP FGVALL FGGVDEKGPQLFHMDPSGTFVQCDAR 
AIGSASEGAQSSLQEVYHKSMTLKEAIKSSLIILKQVI4EEKLNA 
TNIELATVQPGQNFHMFTKEELEEVIKDI 


5621 


3 


819 


WE FVE YTATDAN VKNESJjS S VQQLG I KMTVRYG KFIjS LLKDGA 
crujui wvJjR-HC£KFLiKQQQTS IKSSLLCLQGNYAGHDWFVSSLF 
MIMLGDKEKTFQFLHQFSRLLTSAFLWLPRLHISSYLPNDTVES 
GIHPVYFCSTHY1EMLLKAELPLVFSAFHMSGFAPSQICLQWIT 
QCFWNYLDWIEICHYIATCVFLGPDYQVYICIAVFKHLQQDILQ 
HTQTQDLQVFLKEEALHGFRVSDYPEYME I IjEQN YRT VLLRDMR 
MIRLQST 


5622 


1122 


456 



aastkdavsrkrshsaseksgtgtsiskrlnmnpqirnpmkamy 
pgtfyfqfknlweandrnetwlcftvegikrrswswktgvfrn 
3 vds ethchaercfls w fcddi hs pntkyqvtw yts ws pcpdca 
sevaeflarhsnvnltiftarlyyfqypcyqeglrslsqegvav 
E I MD YED F KYC WEN F VYNDNEP FK P W KGLKTN FRLLKRRLRESL" 
Q 


5623


3 


954 


FLPFFIRAPKISRNGQWLFTFTTPFPFANKALPGWEGIVPACFW 
RKKI LT PS TGTME LLQVT I LFLLP S I CSSNS TGVLEAANNS LW 
TTTKPS I TTPNTESLQ KNWTPTTGTT PKGT I TNELLKMS LMS T 
ATFLTSKDEGLKATTTDVRKNDS 1 1 SNVTVTS VTL.PNAVSTLQS 
S K P KTETQ S S I KTTE IPGS VLQPDAS PSKTGTLTS I P VT I PENT 

SQSQVIGTEGGXNASTSATSRSYSSIILPVVIALIVITLSVFVL 

VGLYRMCWKADPGTPENGNDQPQSDKESVKLLTVKTISHESGEH 
SAQGKTKN 


5624 


159 


898 


PGVAAAAGALPQYHGPAPALVSCRRELSLSAGSLQLERKRRDFT 
SSGSRKLYFDTHALVCLIiEDNGFATQQAEI I VSALVKI LEANMD 
I VYKDMVTKMQQE ITFQQVMSQI ANVKKDM I ILEXSEFSALRAE 
NEKI KLELHQLKQQVMDEVI KVRTDTKLDFNLEKSRVKBL YSLN 
EKKLLELRTE I VALHAQQDRALTQTDR KI ETE VAGL KTMI.F <3 T -i V 
LDNIKYLAGS I FTCLTVALG F YRLW I 


5625 


1 


1180 


TI PS S AAA^KAG P PAGALE ALS PGGARAHABRRGEMRATPliAAP 
AGSLSRXKKLELDDNLDTERPVQKRARSGPQPRLPPCLLPLSPP 
TAPDRATAVATASRLGPYVLLEPEEGGRAYQALHCPTGTBYTCR 
VYPVQEAIAVLEPYARLPPHKHVARPTEVLAGTQLLYAFFTRTH 
GDMHSLVRSRHR I PEPEAAVLFRQMATALAHCHQHGLVLiRDLKL 
CR FVFADRERKKLVL.ENLEDS CVLTGPDDSLWDKHACPAYVGPE 
ILSSRASYSGKAADVWSLGVALFTMLAGHYPFQDSEPVLLFGKI 
RRG A Y ALP AGLS APARCLVRCLLRR E PAERLTATf3 T T t.UDMt tdo 

DPMPLAPTRSHLWEAAQWPDGLGLDEAREEEGDREWLYG 


5626 


3123 


2011 


k> fKAlAJS VAMENQ VLTPHVYWAQRHRELYLRVELSDVQNPAI S I 
TENVLHFKAQGHGAKGDNVYEFHLEFLDLiVKPE PVYKLTQRQVN 
ITVQKKVSQWWERLTKQEKRPLFLAPDFDRWLDESDAEMELRAK 
EEERLNKLRLESEGS PETLTNLRKGYLFMYNLVQFLGFSW 1 FVN 
LTVR FC I LG KES FYDTFHTVADMM YFCQMLAVVE T I NAAIG VTT 
S P VL PS 1* I QLLGRN F I LFI 1 FGTMEEMQNKAWFFVFYLWSAIE 
I PR YS F YM LTC I DMD WKVLTWLR YTLW I PL YPLG CLAEAVS VI Q 

SIPIFNETGRFSFTLPYPVKIKVRFSFFLQIYLIMIFLGLYINF 
RHLYKQRRRRYGQKKKKIH 


5627 


3123 


2011 


PPRALGSVAMEKQVliTPtfVYWAQRHRELYLRVELSDVQNPAISI - 
TENVLHF KAQGHGAKGDNVYE FHLEFLDLVKPEP VYKLTQRQVN 
ITVQKKVSQWWERLTKQEKRPLFLAPDFDRWLDESDAEMELRAK 
EEERIJSKLRLESEGSPETLTNLRKGYLFMYNLVQFLGFSWIFVN 
LTVRFCILGKESFYDTFHTVADMMYFCQMLAWETINAAIGVTT 
S P VLPS L I QLLGRNF I L F 1 1 FGTMEEMQNKA WFF VF YLWSLA I E 
I FRYSF YMLTCI DMD WKVLTWLR YTLWI PLYPLGCLAEAVSVIQ 

SIPIFNETGRFSFTLPYPVKIKVRFSFFLQIYLIMIFLGLYINF 
RHLYKQRRRRYGQKKKKIH 


5628 




1455 


VAGAMASKCLKAGFSSGSLKS PGGASGGSTRVSAMYSSS PCKLP 
SLSPVARS FSACS VGLGRSS YRATS CLPALCLPAGGFATS YSGG 
GGWFGEG I LTGNEKETMQSLNDRLAGYLEKVRQLEQENAS LESR 
IRE WCEQQ VP YMCPD YQS YFRT I E ELQKKTLCS KAENARL WE I 
DNAKLAADDFRTKYE TEVS LRQLVESD INGLRR I LDDLTLCKSD 
LEAQVESLKBELLCLKKNHEEEVNSLRCQLGDRLNVEVDAAPPV 
DIjlfRVIiEEMRCQYETLVENNRRDAEDWLDTQSEELNQQVVSSSE 
QLQS CQAE 1 1 ELRRTVNALEIELQAQHSMRDALESTLAETE AR Y 
SSQLAQKQCM I TNVEAQLAEIRADLERQNQE YQVLLDVRARLEC a 

BINTYRGLLESEDSKLPCNPCAPDYSPSKSCLPCLPAASCGPSA 
ARTNCS ARP I CVPCPGGRF 


5629 


2287 


938


3RPRS S SDNRNFLRERAGLS S AAVQTR IGN SAAS RRS PAAR P PV 
PAPPALPRGRPGTEGSTSLSAPAVLWAVAVVVVVVSAVAWAMA 
^YIHVPPGSPEVPKLNVTVQDQEEHRCREGALSLLQHLRPHWDP 
2EVTLQLFTDG I TNKLIGCYVGNTMED WLVRI YGNKTELLVDR 
3EEVKSFRVI^AHGCAPQLYCTFNNGLCYEFIQGEALDPKHVCN 
^AIFRLIARQLAKIHAIHAHNGWIPKSNLWLKMGKYFSLIPTGF 
ADEDINKKFLSDIPSSQILQEEMTWE4KEILSNl^SPVVLCHNDlH 
LCXNI I YNE KQGD VQ FI D YE YS G YN YLAYD IGNHFNE FAG VSDV 
DYSL YPDRELQSQWLRAYLEAYKEFKGFGTE VTEKE VE I LFIQV 
NQFALASHF FWGLWAL I QAK YS TI E FD FIX3 YAI VRFNQ YFKMK P 
EVTAIiKVPE 


5630 


1194 


278 


GFWAIAQTCAHHLPPGSPWLVPASPWRLPEMSSFGYRTLTVALFI 
TLICCPGSDEKVFEVHVRPKKLAVEPKGSLEVNCSTTCNQPEVG 
GLETSLDKI LLDEQAQWKHYLVSNISHDTVLQCHFTCSGKQESM 
KSNVS VYQP PRQVTLTLQ PTLVAVGKS FTI ECRVPTVEPLDS LT 
LFLFRGNE TLRTETFGKAA PAPQEATATFNS TADREDGHRNFS C 
1AVLDLMSRGGN I FHKHSAPKMLE I YE PVSDSQMV 1 1 VTVVSVL 
LS L FVTS VLLC F I FGQHLRQQRMGTYG VRAAWRRL P QAFR P 1 


5631 


1053 


290 


SRVDDF^PEPSRAEPSRSGRRRPARRAATMSVFGkLFGAGGGlH 
AGKGGPTPQEAIQRLRDTEEMLSKKQEFLEKKIEQELTAAKKHG 
TKNKRAALQAL KRKKR YE KQLAQ I DGTLST I E FQREALENANTN 
TE VLKNMG YAAKAMKAAHDNMDI DKVDELMQD I ADQQELAEE XS 
TAISKPVGFGEEFDEDEIiMAELEELEQEELDKNLLEISGPETVP 
LPNVPSIALPSKPAKKKEEEDDDMKEIjENWAGSM 1 


5632 
5633 


3 


952 


WLGWSPPRRLWWGSLGAAQRPAVPVSGLARSLHVETRRPHRRA""] 
SVRVARGRLGVWAQPQPIiLPRPVGSRREMQPPGPPPAYAPTNGD 
FTFVS SADABDLSGS I ASPDVKLNLGGD FI KESTATTFLRQRG Y 
GWLLEVEDDDPEDNKPLLEELDIDLKDIYYKIRCVLMPMPSLGF 
NRQWRDNPDFWGPLAWLFFSMISIiYGQFRWSMIITlWlFGS 
LT I FLLAR VLGGE VA YGQVLG V IG YS LL PLI VIAP VLL WGS FE 
| WS TL I KLFGVF WAAYS AAS LL VGEE FKTKKPIjIi I YP I FLL Y I Y 
FLSLYTGV | 




771 


460 


QGCSKTMSVGRPFYRSSEFMEQLLSSHliHQ\^FFCCFTWCl^Nn 

CLFBNSVSKLYMLCFNFFMSIFFYSLSITKLNL1YLWGLSYQSL 
LLLLLSGHRPWGSSMV 1 


5634 


1446 


855 


pratgrirsraaasrpragagasgaeprsgrersrlsgrrapamH 

ARNTLS SRFRRVD IDEFDENKFVDEQBEAAAAAAEPGPDPSEVD 
GLLRQGDMLRAFHAALRNSPVNTKNQAVKERAQGVVIjKVIjTNFK 

sseieqavqsldrngvdjulmkyiykgfekptenssavllqwhek 
alavgglgsiirvltarktv 1 


5635 
5636 


3 


943


DRGPRSTAl-DTGRARVSFWRFPLDPGVK^SNVQlSGEKRRFRTXn 

RSLFHPFPVTRSGAPRAVLVGSSWPAKMVAPAVKVARGWSGIAL 

GVRRAVLQLPGLTQVRWSRYSPEFKDPLIDKEYYRKPVEELTEE 

EKYVRELKKTQLIKAAPAGKTSSVFEDPVISKFTNMMMIGGNKV 

IiARSLMlQTLEAVKRKQFEKYHAASAEEQATIERNPYTIFHQAIi 

KNCEPM I G L VP I LKGGRF VQVPVPL PDRRRRFLAMKWMI TECRD 

KKHQRTLMPEKLSHKLLEAFHNQGPVrKRKHDLHKMAEANRAIA 
HYRWW | 


5637 


2253 


1143


ledticqhppaekklylyhrklreverKgiprlpkdvfmdthqg 

LTDVRAKVTGFSEGWDS VKGGFS S FSQATHSAAGAWSKPRBI 
ASlilRNKFGSADNIPNLKDSLEEGQVDDAGKALGVISNFQSS PK 
YGSEEDCSSATSGSVGANSTTGGIAVGASSSKTNTLDMQSSGFD 
ALLHEIQEIRETQARLEESFETLKEHYQRDYSLIMQTLQEERYR 
CERLEEQLNDLTELHQNEILNLKQELASMEEKIAYQSYERARDI 
QEALEACQTRISKMELQQQQQQWQLEGLENATARNLLGKLINI 

LLAVMAVLLVFVSTVANCWPLMKTRNRTFSTLFLWFIAFLWK 
HWDALFSYVERFFSSPR 




948 


2532


3


MSFCGARANAKMMAAYNGGTSAAAAGHHHHHHHHLPHLPPPHLlH 
HHHHPQHHLHPGSAAAVHPVQOHTSSAAAAAAAAAAAAAMLNPG 
QQQPYFPS PAPGQAPGP AAAAPAQVQAAAAAT VKAHHHQHSHH P 
2 Q 0L D I E PDRP IG YG AFG WWS VTD P RDGKR VALKKM PNVFQNL 
^SCKRVFRELKMLCFFICHDNVLSALDILQPPHIDYFEEIYWTE 
liMQSDLHKI IVSPQPLSSDHVKVFLYQILRGLKYLHSAGILHRD 
C KPGNLDVNSNCVLKI CD FGLARVEELDES RHMTQE WTQ YYRA 
3 EILI4GSRHYSNAIDIWSVGCIFAEIjI*GRRILFQAQSP1QQI,DL 
CTDLLGTPSLEAMRTACEGAKAHILRGPHKQPSLPVLYTLSSQA 
THEAVHLLCRMLVFDPYKRISAKDAIiAHPYLDEGRLRYHTCMCK " 
CCFS TSTGRVYTS DFE PVTNPK FDD T FE KNLS S VRQ VKE 1 1 HQ P 
ILEQQKGNRVPLCINPQSAAFKSFISSTVAQPSEMPPSPLVWE 


5638 


125 


1155 


DRKMSELDQLRQEAEQLKNQXRDARKACADATLSQITNNIDPVG 
RI QMRTRRTL RGHLAK I YAMHWG TD S RLLVS AS QDGKL 1 1 WDS Y 
TTNKVHAI PLR£S WVMTCAYAPSGNYVACGGLDNI CS I YNLKTR 
EGNVRVSRELAGHTGYLSCCRFLDDNQIVTSSGDTTCALWDIET 
GQQTTTFTGHTGDVMSLSLAPDTRLFVSGACDASAKLWDVREGM 
CRQTFTGHESDINAICFFPNGNAFATGSDDATCRLFDLRADQEIi 
KTY5HDNI I CGI TS VS FS KSGRLLIaAG YDDFNCNVWDAL KADRA 
GVLAGHDNRVSCXiGVTDDGMAVATGSWDSFLKIWN 


5639 


125


1155 


DRKMSELDQLRQEAEQLKNQIRDARKACADATLSQITNNIDPVG 
R I QMRTRRTLRGHLAKI YAMHWGTDSRLLVS AS QDGKL 1 I WDSY 
T TNKVHAI PLRS SWVMTCA YAPSGN Y VACGG LDN I CS I YNLKTR 
EGNVRVSRELAGHTG YLS CCRFLDDNQ I VTSSGDTTCALWDI ET 
GQQTTTFTGHTGDVMSLSliAPDTRLFVSGACDASAKLWDVREGM 
CRQTFTGHESDINAICFFPNGNAFATGSDDATCRLFDLRADQEL 
MTYSHDNIICGITSVSFSKSGRIiLLAGYDDFNCNVWDALKADRA 
GVLAGHDNRVS CLGVTDDGMAVATGSWDS FLKIWN 


5640 


280 


1092 


QQGNKKTMLSHNTMMKQRKQQATAIMK2VHGNDVDGMDLGKKVS 
IPRDIMI*EELSHLSNRGARLFKMRQRRSDKYTFENFQYQSRAQI 
NHS I AMQNGKVDGSNLEGGSQQAPLTP PNTPDPRS PPNPDNIAP 
GYSGPLKEIPPEKFNTTAVPKYYQSPWEQAISNDPELLBALYPK 
LFKPEGKAELPDYRSFNRVATPFGGFEKASRMVKFKVPDFELH, 

LTDPRFMS FVNPLSGRRS FNRTPKGV? I S ENI PI VITTEPTDDTT 
VPESEDIi 


5641 


27 


332 


CKHNCNGDVKLLSNQMDKLFAFRLFTFHGLLHFLDGSiQKLIQA 
EIILSDNSSIIiVLENNFLFKVKSKQFIHLIAKKFYISITIVSAS 
NGESFVLSMIVTG 


5642 


199 


1247 


ITPCRMDFIjVIjFLFYLASVLMGLVLICVCSKTHSLKGLARGGAQ 
I FS C 1 1 PECLQRAMHGL LH YIj FHTRNHT FI VLHL VLQGM VYTE Y 
TWEVFGYCQELELSLHYLLLPYLLLGVNLFFFTLTCGTNPGIIT 
KAKEI*LFLHNTYEFDEVMFPKN\mCSTCDIiRK3>ARSKHCSVCllWC 
VHRFDHHCVWVNNCIGAWNIRYFLIYVLTLTASAATVAIVSTTF 
LVHL WMS DL YQET YI DDLGH LHVMDT VFLIQYLFLTFPR I VFM 
LGFVWLS FLLGG YLLFVLY LAATOQTTNEWYRGD WAWCQRC P L 
VAWPPSAEPQVHRNIHSHGLRSNLQBIFLPAFPCHERKKQE 




5643 


1 


847 


PSGGVRDVETRGPGSRAARGPRWMKRRGVGAGAIAKKKLAEAK 
YKERGTVIiAEDQIiAQMSKQLDMFKTNLEEFASKHKQEIRKNPEF 
RVQ FQDMCATIG VD PLASGKG FWSEMLGVGDFYYELGVQI I EVC 
LALKHRNGGLITLEELHQQVLKGRGKFAQDVSQDDLIRAIKKLK 
ALGTGFGI IPVGGTYLIQSVPAELNMDHTWLQLAEKNGYVTVS 
EIKASLKWETERARQVLEHLLKEGLAWLDLQAPGEAHYWLPALF 
TDLYSQE I TAEEAREALP 




5644 


83 


1138


PRRMGSWVQLITSVGVQQNHPGWTVAGQFQEKKRFTEEVIEYFQ 
KKVS P VHL KILLTS DEAWKR FVRVAE L P R EEADAL YE ALKNLT P 
YVAIED KDMQQKEQQ FREW FLK E FPQ I RWK IQES IERLRVIANE 
I EKVHRGCVI ANWSGSTGI LS VIGVMLAPFTAGLSLS ITAAGV 
GLGIASATAGIASS I VENTYTRSAELTASRLTATSTDQLEALRD 
I LHDI TPNVLSFALDFDEATKM I ANDVHTLRRSKATVGRPIiIAW 

T? V\/DT WITT/CP r DTDflJi oinn tt re in f> oxtt ■ — 

x.xvr ±e* vvfii UrC X KviArTKi. VKKVAkN1jGKATSGVIiVVTjDVVNX» 

VQDSLDLHKG E KS ES AE LLRQWAQELEENLNELTH I HQSLKAG 




5645 


537 


799


VQS VRDLKR LSPTDP PGDSGNRD VTRED PVTGPLNSAS SQVPTL 
YLCLQNS LLGHS S VEDARATMEL YQ I SQR I RARRGLPRLAVSD 




5646


3745 


3328 


AEQYGTSPHXLPTMLLSSCLPPANVTTKAATPPPLVLSLTTADP 
AGKPAPCRVTLTLIiRAS I PATKRAS FLS S FI XMFFBELE YILGF 
LSLLKFHVHVS VYSAI CHFQKEGTGNSRS FTCTPELFPRLQTHL 
RABGGAQ 




5647 


288 


800 



3VIMATSELSCEVSEENCERREAFWAEWKDLTLSTRPEEGCSLH 
SEDTQRHET YHQQGQCQVLVQRS PWLMMRMG ILGRGLQE YQL P Y 
QRVDPLPIFTPAKMGATKEEREDTPIQLQELLALETALGGQCVD 
RQEVAEITKQLPPWPVSKPGALRRSLSRSMSQEAQRG 


5648 


7 


1518 


VIiSSLCGRHEAIiREVGAEWPPPTCSPKICSGLQQAGNTDWSLTM 
APQSLPSSRMAPLGMLLGLLMAACFTFCLSHQNLKEFAIiTNPEK 
SSTKETERKETKAEEELDAEVLEVFHPTHEMQALQPGQAVPAGS 
H VRLNLQTGEREAKLQ YEDKFRNNLKGKRLD I NTNTYTSQDLKS 
ALAKFKEGAEMESSKEDKARQAEVKRLFRPIEELKKDFDELNW 
IETDMQIMVRLINKFNSSSSSLEEKIAALFDLEYYVHQMDNAQD 
LLSFGGLQWINGLNSTEPLVKEYAAFVLGAAFSSNPKVQVEAI 
BGGALQKLLVILATEQPLTAKKTCVLFALCSLLRHFPYAQRQFLK 
LGGLQVLRTLVQEKGTEV1AVRWTLLYDLVTEKNFAEEEAELT 
QEMSPEKLQQYRQVHLLPGLWEQGWCEITVHLLALPEHDAREKV 
LQTLGVLLTTCRDRYRQDPQLGRTLASLOAEYQVLASLELQIX3E 
DEGYFQELLGSVNSLLKELR 


5649 


1172 


3006


mlqeqldaineeirmiqeekestelraeeietrvtsgsmeaLnl 

KQLRKRGS I PTSLTDLSLASAS PPLSGRSTPKLTSRSAAQDLDR 
MGVMTLPSDLRKHRRKLLSPVSREENREDKATIKCETSPPSSPR 
TLRLEKLGHPALSQEEGKSALEDQGSNPSSSNSSQDSLHKGAKR 
KGIKSSIGRLFGKKEKGRLIQLSRDGATGHVLLTDSEFSMQEPM 
VPAKLGTQAEKDRR LKKKHQLLEDARR KGMP FAQ WDGPT WS WL 
ELWVGMPAWYVAACRANVKSGAI MSALS DTE I QRE IGISNALHR 
LKLRLA I QEM VS LTS P SAP PTSRTS S GNVWVTHEEMETLETS TK 
TDSEEGS WAQTLAYGDMNHE WIGNE WLPSLGL PQYRS YFME CLV 
DARMLDHLTKKDLRVHLKMVDS FHRTSLQYG I MCLKRLNYDRKE 
LEKRREESQHEIKDVLVWTNDQWHC7VQSIGLRDYAGNLHESGV 
HGALLALDENFDHNTLALILQI PTQNTQARQVMEREFNNLLALG 
TDRKLDDGDDKVFRRAPSWRKRFRPREHHGRGGMLSASAETLPA 
GFR VS TLGTLQP PPAPPECKIMPEAHSH YI* YGHMLSAFRD 


5650 
5651 


1172 


3006 


M LQEQL DA I NE E I RM I QEEKESTELRAEE I ETR VTSG SME ALNL 
KQLRKRGS I PTSLTDLSLASASPPLSGRSTPKLTSRSAAQDLDR 
MGVMTL PS DLR KHRRKLLS P VS REENREDKATI KCETS P PS S PR 
TLRLE KLGHPALS QEEGKS ALE DQGSNPS S SNSS QDSLHKGAKR 
KGIKSSIGRLFGKKEKGRLIQLSRDGATGHVLLTDSEFSMQEPM 
V? AKLGTQ AE KDRR LKKKHQLLE DARRKGM P F AQ WDG PT WS W L 
ELWGMPAWYVAACRANVKSGAIMSALSDTEIQREIGISNALHR 
LKLRLA I QEM VSLTS PSAPPTS RTSSGNVWVTHEEM ETLETSTK 
TOS EEGS WAQTLAYGDMNHE W IGNEWLP S LGLPQYRS YFME CLV 
DARMLDHLTKKDLRVHLKMVDS FHRTSLQYG I MCLKRLNYDRKE 
LE KRREES QHE I KD VLVWTNDQ WHWVQS IG LRD YAGNLHESGV 
HGALI^DENFDHNTLALILQIPTQNTQARQVMEREFNNLLALG 
TDRKLDDGDDKVFRRAPSWRKRFRPREHHGRGGMLSASAETLPA 
GFRV3 TLGTLQ P P PA P PKKI M PEAHSH YL YGHMLSAFRD 


5652 


646 


1869 


AKUGQRO P WG * EARAKG PASS S PRV * EGSG WEGPAS P * T PGSTL 
AWGEGAG I R* ASGLTAAGAAS AAAA/PPPTRGGPAPAGCGRAPP 
WPAPLRVPTHGRAPAPRS RAAPRAPALSHGTAAAALS PAS PAGP 
ADP*LPGHSSQSPPRG*RWGRSRSAPAPAHPEHPAPAGSASASQ 
QTPG WPGSCCLAQGWQAE PLGAPGAEDG \ PVPPQRGFPLGTLGS 
PAGSWAGLAGYG*AGAPGTQATAPRAAGQTPVAAAPNCRV+GSA 
PALHRAPAAAD PGS PLQAPPRAWAS PAAAG P GLS SSD Y CGGLGA 

GWRAGISPELLGAAGLSDNWARCPGPGPAE*GGQPGCRTIPASA 
CMPSPPVEGSLGLSRKGHGDLPSOAR*GWHECI?padht vdt ddt 
LGPRGRTGRPSS PS 


5653 


735


343


^.KKYQHIHQl^FSCPEPACGKSFNFKKHLKEHMKLHSDTRDYI 
2E FCARS FRTSSNLVIHRRI HTGEKPLQCEI CGFTCRQKAS LNW 
iQRKHAETVAALRFPCBFCGKRFEKPDSVAAHRSKSHPALLLA 




66 


1401


*GRLQSRGRLTLGLVLLLLDILGARQHGQRVSHGWKGGFLTAPL 
ZFPQPCQPGTRRGRRRSLKEATEPQIAMAEEFVTLKDVGMDFTL 
5DWEQLGLEQGDTFWDTALDNCQDLFLLDPPRPNLTSHPDGSED 
.BPLAGGSPEATSPDVTBTKWSPLMEDFFEEGFSQEI/SRDVIQ 
WLLB LQFRRS L YRGHLVR » FARRS RKS S EV* Y CHQRGKSHGMQ 
ES* IKERTQSCVHRFHGRRFHG\DNVSEKTLTP^y<S^PVPr'P'pi? — 1 

S YSDHSQQDS VQEG EKP YQCSE CGXS FSGS YRLTQH W I THTRE K 

PT VHQECEQG FDR KASH SG Y PKTHTG YKF YVCNE YGTP FS QSTY 

LWHQKTHAGEKPCKSQDSDHPPSHDTQSGEHQKTHTDSKSYNCN 

ECGKAFTRIFHLTRHQKIHTRKRYECSKCQATFNLRKHLIQHQK 
THAANV 


5654


3 


598 


TLP L FPGRRFRG WRRCGAVAAR KNSTGGNVS I NQRR DS VRMS AL 

NWKP FVYGGTiA *J T TJi B" rf^T CD T riT TirTDDATnnnmKTnnvnrrciTT 1 

YRGMLHALTOIGREEGLKALYSG*VGLHAFLCHCSLFHMGIDFR 
PRLHRSQVXSLRCV* KEQIA* * /MFSLLISTLISKYI YYAADVL 
E KL FY Y IQVQTDKNKK I CLFKN I j 


5655


2 


867 


RP PG I RAPRQLH PAAG RR PDASARPRFR PT VLLHDPFQL S F P PP 
PLSYPSVFPAVARVLPQRSGDYRAAGMPQLSGGGGGGGGDPELC 
ATDEMIPFKDEGDPQXREKIFAEIVNPEEEGDLADIKSSLVNES 
ex x c-Aoxvumjo v/utyAy x c>y t»t» xHJU1UU<cIHPDJXjKHPI^GLYNKG f 
PSYSSYSGY IMMPNMNNDP YMSNGSLS PPI PRTSNKVP WQPSH 
AVHPLT PL I T YS DE H FS PGSHPS HI PS DVNS KQGM SRHP PAPD I 
PTFYPLSPGGGGQITPPLGWQGQP | 


5656


228 


1066 


prrvpplpefasgpgaaffhsgrlqrsltkdsagcfsqcrsramH 

LVLRSGIjTKAI^RTIiAPQVCSSFATGPRQYDGTFYEFRTYYIjK 
PSNMNAFMENIiKKNIHLRTSYSELVGFWSVBFGGRTNKVFHIWK 
YDNFPHRAEVRKALANCKEWQEQS 1 1 pnlaridkqeteityli p 
Nbj^ijyj\f p KJj.t» v YE LAVFQM KPGG PAL WGDAFE RA I NAHVNLG Y j 
TKWGVFHTEYGELNRVHVLWWNESADSRAAWHXSHEDPISWG 
GVRESVNYL\VSQQNM j 


5657


105 


1052 


GQRLQSPRVQMPVQPPSKDTEEMEAEGDSAAEMNGEEEESEEER 
SGSQTESEEESSEMDDEDYERRRSECVSEMLDLEKQFSELKEKIi 
FRERLSQLRXjRLEEVGAERAPEYTEPLGGXiQRSLKIRIQVAGIY 
KGFCLDVIRNKYECEXiQGAKQHLESE KIiIxLYDTLQGELQER I QR 
LEEDRQSLDLSSEWWDDKLHARGSSRSWDSLPPSKRKKAPLVSG 

pyivymlqeidiledwtaikkaraavspqkrksd\dldpavhsq 
gdpqsswhctqdsrlppadrrthrplrvcparllwccwalplhl 

ALVWTPPL 1 


5658 


2346 


3541 


terrvynpwpepdpd\ciqedpwnlpnsiktlvdniqryvedgk ' 
NQLLLALIjKCTDTELQLRRDAI fcgalvaavctfseqllaalg y 
rynnngeyeessrdasrkwleqvaatgvllhcqsllspatvkee 
rtmlediwvtlseldnvtfsfkqldenyvantnvfyhiegsrqa 
lkvifyldsyhfsklpsrleggaslrlhtalftkvlenveglps 

PGSOAAEDLOODINAOQT.RJn/<VWVD."irr dzvpvt ?dott rvrn^om 1 

tavkidqlirpinaldelcrlmksfvhpkpgaagsvgaglipis 
selcyrlgacqmvmcgtgmqrstlsvsleqaaiiarshgllpkc 

IMQATDIMRKO^PRVEIIAKNLRVKDQMPQGAPRLYRLCQPKfMN 
GDL 


5659 


2 


696 


WKRSGE VS P KGELGAWRGNSGRPKIIGRAAPARNRr>PTT ror r -o — 1 
GN ERS Q PRS PLR LLA PQLKAEAAADKGLAP VPP P FS SGHS GP C \ 
EREGEGQRG RGRS RRGAH LELKP S PGLRAGAPTDRGRGGPAE VA 
AAGGRRMVQKES QATLEERBSELSSNPAASAGASLE PPAAPAPG 
EDNPAGAGG\ AAVAGAAGGARRFLCGWEG PYOPP uvrrfpno vt»t 
FRRLQKWELNTYL | 


5660 


229 


853 


P VTM WAFS E DPM PLL INL IVS LLGFVATVTL I PAFRGH F I AAPTi* | 
CGQDLNKTSRQQIPESQ3VISGAVFLIILFCFIPFPFLNCFVKE 
QRKAFPHHEFVALIGALLAICCMIFLGFADDVLNLRWRHKLLLP 
TAASLPLLMVYFTNFGNTTI WPKPFRP I LGLHLDLGR* S YHCC 
PYGT YFRE PFLVLHILLQVFLFCLCVFPDP FW j 


5661 


2 


473 


LNLYPSPCGGIPKLPGLPREAAAALGASFLAEAPLPVTVRGSGL " 
AGMAVTCD P KAFLS I CFVTLVFLQLPLAS I CQN* GTDSCASRG K 1 
ADFDVTGPHAPILAMAGGHVELQCQLFPNISAEDMELRWYRCQP 
SLAVHMHERGMDMDGEQKWQYRGRT | 


5662 


2 


1318


LRKEGRCRRGSNRGVWAAPAEGLGGRGMLGVRCLLRSVRFCSSA 
PFPKHKPSAKI^VRDALGAQNASGERIKIQGWIRSVRSQKEVLF J 
LHVNDGSSLESLQWADSGLDSRELTFGSS VEVQGQLI KSPSKR' 
QNVELKAEKI KVIGNCDAKDPP I KYKERHPLE YLRQYPHPRCRT 
1 NVLGSII^IRSEATAAlHSPPiaiQGtrvHTW'rDTTTCMnc'pr'ar'o 

LPQLEPSGKLKVPEENPFNVPAPLTVSGQLHLEVMSGAFTQVFT 
FGPTFRAENSQSRRHIiAEFYMIEAEISFVDSLQDLMQVIEEIjFK 
ATTMMVLSKCPEDVEIiCHKFIAPGQKDRL*HMLKNNFIiI ISYTE 
1 AVEILKOASONFTF'TPEWGADTjPTFWP'KVT.VVWr'rnvTTOTrBnrTMv 

PLTLKP FYMRDNEDGPQELEGSVA*HSIiGLM I LLSI WIGQP 


5663 


119 


698 


PADIGRSTAKTPGPPRSLEMDDPRYGMCPLKGASGCPGAERSlX " 
VQSYFEItt3PLTFRDVAIEP^l 1 PPWnr , T.nc;B.nnr , T vuvt/mt c»tvtvt3 

NLVFLGIALTKPDLITCLEQGKEPWNI KRHEMVAKPPVICSIIFP 
QDLWAEQDI KDS FQEAI LKKYGKYGHANFQLQKGCKS VDECKVH 
KEHDNKLNOCL I PKKKK 


5664 


118 


572 


SLSMESNHKSGDGLSGTQKEAALRALVQRTGYSLVQENGQRKYG 
GPPPGWDAAPPERGCEIFIGKLPRDLFEDELIPLCEKIGKIYEM 
RMMMD FNGNNRG YAFVT FSNKVE AKNAI KQLNN YEI RNGRLLG V 
CASVDNCRLFVGGIPKTKK 


5665 


347 


702 


WQHL I ILLHCERTSPAMITSELP VLQDSTNETTAHSDAGSELE ' 
Ultl *»wiutiwiWKJrwfcir*'o iW^JvirKlvoPeEKSRIEAGIRGAGRGR 
ANGHPQQNGEGEPVTLFEWKIjGKSAMQRC 


5666 


213 


540 


vov.u.r lo^ftni J. A^WUJJUrVPr«5SnPDEYKIAALVFYSCIFII 
GLF VN I TALWVFSCTTKKRTTVTI YMMNVALVDLI FIMTLPFRM 
FYYAKDEWPFGEYFCQI LGA 


5667 


1 


695 


HPLPSASLGLPSVSLGVSLCVRSALLEAWPMLPKRRRARVGSP 

duwnnooArro J Rr F\jV.MJI X 1j Vliir'KfQUKoKKAr IjTGIjARSKGFR 
VLDACSSEATHVVMEETSAEEAVSWQERRMAAAPPGCTPPAIjLD 
ISWLTESLiGAGQPVPVE CRHRI»EVAGPSKGPI»S PAWMPAYACQR 

PSPVTTLSQLQ 


5668 


691 


894 


CS FLFCI PDIiFLOFLIiGHirFPRA vf.vnnwu cd ct nrr rir»/-\*rvrirk — 
VLVRTAI RCAQAQTGIDLSGCTKW 


5669 


407 


1 


DSGAPEGLSPLMSTQEGLSMHAHPQAYTPFIYLHARKRRGEIGD ' 
ADSRFNDR YAHKSAQIiYFL YFVCW I FQDVY YFTI KEKNHFFF PK 

ARGAPTKYSGSPIGSPTTTPPTRPPSFNLHPAPHLLASMQLQKI, 
NSQ 


5670


3 


373


SSECIiTMAWIPLLLPLLILCTVSVASYEtAOPSSVSVdPGQTAK 
ITCSGDVIAKKYARWFQQKPGQAJPVLVIYKDTERPSGIPERFSG 
S TSGTTVTLTI SGAQVEDEAD Y FC YS ATDNFLWVF 


5671 


280 


524


KFPPKKTPPHLGMESAITLWQFL^QLLLDQKHEHLICWTSNDGE 
FKLLKAKKV7AKLWGIiRKNKTNMNYDKI^RALRIjLFMT 


5672


2 


557 


FVPATPDPGVWLPPSRDPAMAKRSSTiYIRIVEGKNLPAKDITG^ 
SDPYCIVKVDNBPIIRTATVWKTLCPFWGEEYQVHLPPTFHAVA 
F YVMDEDALSRDD VIG KVCLTRDT I ASHP KG KFS L PSHTGL PS P 

WPPSHSETSPLGSVWSPAQGKPFLLSPEAGATFCTPGLCSAACS 
QAWLLLPLP 


5673 


327 


696 


ITVADQISHWSAGRIKNRTRIPEC!IH^^AATTY7nf3i>HTMi7r»T7c , i7 ~" 
KLS SQTLi I QAGDDEKNQR T I T VNPAHMG KAFKVMNELRS KQL LC 
DVMI VAEDVEI EAHR WLAACS P YFCAMFTGDMS 


5674 


17 


984


GGGSMEGES TS AVLS GFVLGAIAFQllL!NTDSDTEGFL LGE VKGE 
AKN S I TDS QMDDVE WYT ID I Q KYI PC YQL FS F YNS SGEVNEQA 
LKKILSNVKKNWGWYKFRRHSDQIMTFRERLLHKNLQEHFSNQ 
DLVFLLLTPS I ITES CS THRLEHSLYKPQKGIiFHRVPL WANLG 
MSEQLGYKTVSGSCMSTGFSRAVQTHSSKFFEEDGSLXEVHKIN 
EM YAS LQEE LKS I CKKVEDS EQAVDKLVKDVNR LKRE I EKRRGA 

QIQAAREKNIQKDPQENIFLCQALRTFFPNSEFLHSCVMSLKID 
MFLKVAVTTTTISM 


5675


80 


753 



EGSRRG PTR LARLS ARAGRI>HFP PGFS SRIiI HFRG VSECRRP PG 
KSGVPVSAPGSDGKWWEERPGMFSLMASCCGWFKRWREPVRKVT 
CjLMVGLDNAGKTATAKGIQGEYPEDVAPTVGFSKINIiROGKFEV 
riFDUSGGIRIRGIWKNYYAESYGVIFVVDSSDEERMEETKEAM | 
SEMLRHPRI SGKP I LVIANKQDKEGALGEADVIECLSLE KLVNE 
HKCL 


5676


2 


930 


FVSSPPPRPVQPARPGGFGLSGRRSIiLCQVASTPAHVGVMRSPV 
RDLARNDGEESTDRTPLLPGAPRAEAAPVCCSARYNIiAILAFFG 
FFIVYALRVNIjSVALVDMVDSNTTLEDNRTSKACPEHSAPIKVH 
HNQTG KKYQWDAETQGW I LG S F F YG YI I TQ I ?GG YVAS KIGGKM 
LLGFGILGTAVLTLFTP I AADLGVGPL I VIiRALEGLGEGVTFPA 
MHAMWSS WAPPLERSKLLS IS YAGAQLGTVISLPLSG 1 1 CYYMN 
WT YVF YFFGT I G I FWFLLW I WIjVS DTPQXHKR I SH YE KE YI LSS 
L 


5677 


1 


1028 


PPRDGFLELRRLSVPLCSGPCPLTSLSRQGERSGGHIiVAAARAA 
VTAE THPL PLLAPIiAVCQSVKS PAACQVRPR PRAVAL PAALGG P 
GRSL PGXiTAATMSS FS E S ALEKKLSE LSNS QQS VQTLS L WL IHH 
RKHAGPIVSVWHRELRKAKSNRKLTFLYLANDVIQNSKRICGPEF 
TREFESVLVDAFSHVAREADEGCKKPLERLLNIWQERSVYGGEF 
IQQLKLSMEDSKSPPPKATEEKKSLKRTFQQIQEEEDDDYPGSY. 
S PQDPS AGPLLTEELI KAIiQDIiENAASGDATVRQKIAS I>PQEVQ 
DVSLLEKITDKSAAERLSKTVDEACLRNRGPGTS 


5678 


3 


593 


SSSPPSSTPSLPLPFYL1.LGQLRLQLLWGTAHLSGAGEAAPCPG 
GSGRTAAPRTRADPAAQSLMIMNKMKNFKRRFSLSVPRTETIEE 
SLAE FTEQFNQLHNRRNENLQLGPLGRDPPQECSTFS PTDSGEE 
PGQLSPGVQFQRRQNQRRFSMBVRASGALPRQVAGCTHKGVHRR 
AAALQPDFDVS KRLS LPMBI 


5679 


2 


623 


LNSRVDDFVAVPGAIMDEDYYGSAAEWGDEADGGQQEDDSGEGE"" 
DDAEVQQECLHKFSTRDYI MEPS I FNTX>KRYFQAGGS PENVIQL 
LSENYTAVAQTVNLLAE WL I QTGVE PVQVQETVENHLKSLLI KH 
FDPRKADS I FTEEGETPAWLEQMI AHTTWRDLFYKLAEAHPDCL 
MLN FTVKVGRVLEtiRRKVFMNVYFWLLVCFL 


5680 


258 


592 


RRLTSTSEKLQNR^SHTPLESLIHPQPSYKGFGIMFGiCKKKKIE 
ISGPSNFEHRVHTGFDPQBQKFTGLPQQWHSLIADTANRPKPMV 
DPSCI TPIQLAPMKTIVRGNKPC 


5681 


45 


869 


IXCAKTLiGVRTKESOAEG YNRSG INNHQAEDPRFCPSFCWMRSA "" 
RQTRPQRLRKEAARPPTPGSCPGGTGMDGKKCSVWMFLPLVFTL 
FTSAGLWIVYFIAVEDDKILPLNSAERKPGVKHAPYISIAGDDP 
PAS CVFS QVMNMAAFLAL WAVI«K FI QLKPKVLNPWLNISGLVA 
IiCLAS FGMTLLGNFQLTNDEEIHNVGTSLT fgfgtltc w iqaal 
TLKVNI KNEGRR VG 1 PRV I LS AS I T1»C VG PLLHPHGPKHPHVCS 
QSPVGPGHVL 


5682


39 


622 


PSRSCLGTMRKWRHREVNLPEVTQQDAVCPAPI PS PGLSAQTGlT" 
QKIWGTIHCQVCPGAPAWPGSPWHEEMGIiLLLVPLLLLPGSYGL 
PF YNG FY YSNS ANDQNLGNGHGKDLItNG VKIiVVETPEETIj FT YQ 
GAS V I L P CRYRYE PALVS PRR VR VKW W KLSENGAP EKDVI* VAIG 
liKHKs* r CiU YQGK VHLiRQD 


5683 


89 


778 


GSCGATALITRCLAWSVLISRLAMATYTCITCRVAFRDADMQRA 
HYKTDWHRYNLRRKVASMAPVTAEGFQERVRAQRAVAEEESKGS 
ATYCT VCS KKFAS FNAYENHLKSRRH VEI»E KKAVQAVNR KVEMM 
WtJ^JUliKGlA3VDSVI3KDAMNAAIQQAI KAQPSMS PKKAPPAPAK 
E ARNW AVGTGGRGTHDRD P S EKP PRLQWFEQQAKKLAKHS EDD 
SEDEEHDLC 


5684 


195 


677 


T WCFRG YLGPR VI MKALDE P P YLTVGTl) VSAKYRGAFCEAK I KT 
AKRL VKVKVTFRHDSSTVE VQDDHI KGPLKVGAI VEVKNLDGAY 
QSAVINKIiTDASWYTWFDDGDEKTLRRSSLCLKGERHFAESET 
LDQLPLTNPEHFGTPVIGKKTNRGRRYE 


5685


779 


1262 


LLLQQPVVHCFLLFPPFRFSHHMIPGPPGPHTTG IPHPAiV'l'pO " 
VKQEHPHTDSDLMHVKPQHEQRKEQEPKRPHIKKPLNAFMLYMK 
EMRANWAECTLKESAAINQILGRRWHALSREEQAKYYELARKE 
RQLHMQ LY PGWS ARDNYVS PS S I P VALHS 


5686 


128 


1181 


CTWWQVNITIjIjDINDNHPTWKDAPYYINLVEMTPPDSDVTTVVA 
VDPDLGENGTLVYSIQPPNKFYSLNSTTGKIRTTHAMLDRENPD 
PHEAELMRKIWSVTDCGRPPLKATSSATVFVNLLDLNDNDPTF 
5687


5688


17


917
" QNLPFVAE VLBGI PAGVS I YQWAIDLDEGLNGLVS YRMPVGMP ' 
RMDFLINSS SG VWTTTBLDRBRIAEYQLRWASDAGTPTKSST 
STLTIHVLDVNDETPTFFPAVYNVSVSEDVPR\GSGWSG*AARN 
ND VGLNAELS Y F I TGGNVDGKFS VGYRDAVVRTWGLDRETTAA 
1 YML I LEA I PNG P VG KRHTGTAT VFVT VIiDVNDXR PI XLQS S YV 



~420- 



AAPPAPPDG/ PPP/PPPAPPT/ 1 PGPA A/APASSCQPRLSAGRAA ' 
QGDGGAAAVGH VLWPAVGPVRVNPGLQTP VPR PELLPG P \ S SS 
LHSDSSYPPDAGLSDDEEPPDASLPPDPPPLTVP/ADA/PMPVT 
SG CRM P S TS A S E / AAGGQGACTHAKGS ETP PPAS P QTS E PAP S P 
LPPHLTGGPGMYSSEAKI,PNSFSCLGLAGTGAGI*GTASAHGTG 
PPVLPHVCTPSliANPQP\AVGPEASSLPLGVSGIGMSA/SAPIS 

SSPFVAIGSCWLRGIPPPGSGFLCPGRAPGPVPITTHGQEGQGP 
VLDI 



LTKWDLFGKCYRLLKTGIEHGAMPEQVGVYWYS/CLYDSRKLF'F" 
*SHMI IRSLL*KVIDDSLGQLPLLRELLL* *LNVIDRCI ILAYV 

LRVEKTFAITYLKNFTVKVDFSLLGEIPLISMAAILKLWIMKID 
DGYIPAVF 



HELSGKHISMVSGNTCNWHPGGHS PGGGGQGE ITSKDRGEI PAL 
I WA/RK? I GTWTATKPTHRAG * GGAEE YQPP FQ P C EG PRS TSRG 
GEG*GHAVGPGRE I GKEGS LPFIiGPKALGF*SASCQRAFEGGAH 
GSTARKPAPATPGTRHPRTMETREVAQGWPAGPRSQFWDQHPHS 
PGEHRPSG \ S P LPACPP RAW P KAGAVAS ATGTG \ PQL PGS RGKQ 
KLPRTREPPLLQAGWAVRKPPWSEAKEGLGQAGRPSGMDS SAS \ 
PQTPGGRGSLEWGLPLYLGPHHDVK*RSDRLG+ PP * GGQGGGGH 
GAPSTPGPGGEAW*LPQQTSRPKPGPQAY*GE\GSPGLQCPCSK 
EL*RVPPGSIiGPSTQCMYEPTDKHS\GGADAQLEVSTAGSRSTF 
GQELKGPLDAGRLWPGAPSASSSHR*GG*ERARAGAGHRGST*A 
S SKI EQGRPRPGPTSDALADVEGGABS/G PHPWPLPGTLPNR/ P 
G5PPPA*ASAGRKGTVS TLGGGLL 

PSPPAGVCAAPAPLPLIJUARRDRRPCSPGAEAAPWQTGGPAID" 
GAWRTSVSALRRGATG/APCSPGAEAAPWQTGGPAIDG\DGELP 
*VRSEEAPRGCGAEGGGPGSGPVRRPGAGRGAHAGQGRQQr>PEP 
DGLRHRQHGAASHARHRLQRLRPGHHQNRHVRRDPQAPPGGPAP 
GHAAALP ERTRGVAE PPAWAHAGSDAWRAGR* SQRT * ERARPRH 
PTFQGRAGS\GQPGYQPPNPHPGPSSPPAAP\GPRGA*GNPQLE 
KAPRSDRNPSQGLRTRIRRPETPDCGPPSPAGSSASASTFRCTS 
SLSLLGP/ PGAHNLDTAPQDR* HGP*GDKRGAPG VAGEDPRP P * 
| GNFVR*LLLMP/GVA*RHGTSPFLGPSLGENGGQWDSGNIiFGTP 
! KG* SHPAFTKST * SMEAEKS YWNHPHR\DRGRQGVR INCLRVGE 

SEMWGPYSAPRPGTVFLSSFLSPASEEH\PEGSSSFNTPFPPAG 
PEGDPGLNS PGLLP 



5690 



1424 



58 



550 



"5*92 



1193 



~54B- 



ISNDPSPGYNIEQMAKRGKKLVELPYTVKGMDVSFSGlLSFiED 
VAHRMLATGECTPEDLCFSLQ VMQ * KTGTESWG* RFYI VEQN* S 
GDAPLIFSPYIiSLTGNCGFAMLVEITERAMAH\CGSPGGPSLWG 
GVGVYVLLES VPLS YS 



5693 



1258 



"1330- 



tqawtraekdrkgsvralrlhlergppt*rgshpl\qsvpciqk' 

PSIFSSYPI/GLPQSGGEPGPVGEQQPVRRPEQPSCGPASRMPL 
TSRSVPPGRGALPPDSLSTRKGLPRPSTAGHRVRESGHKVPVSQ 
RLNLP VMGATRSNLQPPRKVAVPGPTR* RDQDS KQDFSS KPLQS 
VPGLASTQQTLTPADSGPGTGGRDATRAGLPGVETMGNGVD 



1338 



ALTWPVRKGTTWWAQPHGCSNLVSRARLDLSSRPSQNTEPQAP" 
*QAGPPSSLRPP\SRRR*APEWPKRATGSRCRGLSAPPWPWPAA 
RGE/PGSAPSHAP/PNSPRPSGTRHP/PGPSSRVLYSPSIiPRNS 
PEAIVWRSSRFPLWFPLRCCFWVSGFKDPNPVLRFF 



gskeparslhrrgsghkss agkwgsvtl£tagalg»kqXhq*wt" 

QRCL\NNLSSEEFNASSSLNSLPSTPTASRRNSTIVLRTDSEKR 

slaesglswfseseekapkkleydsgslkmepgtskwrrerpes 
cddsskggelkkpislghpgslkkgktppvav?spithtaqsal 
kvagkptokatdkgklavkotglqrsssdagrdrlsdakkppsg 
iarpstsgsfgykkpppatgtatvmqtggsatlskiqkssgipv 
SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine,
R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)
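
The symbols in this legend can be tallied mechanically for any listed segment. The following is a minimal sketch in Python, not part of the original listing; the function name `summarize_segment` and the returned field names are assumptions introduced purely for illustration. It counts stop codons (`*`), possible nucleotide deletions (`/`), possible nucleotide insertions (`\`) and unknown residues (`X`), and returns the segment with only the twenty standard one-letter residues retained.

```python
# Hypothetical helper (illustrative only): summarize the legend symbols
# that appear in one amino acid segment from the table.
STANDARD_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")


def summarize_segment(segment: str) -> dict:
    """Count legend symbols and return the residue-only string."""
    # Drop whitespace introduced by line wrapping in the table.
    cleaned = "".join(segment.split())
    # Keep only the twenty standard one-letter residue codes.
    residues = "".join(ch for ch in cleaned if ch in STANDARD_RESIDUES)
    return {
        "residues": residues,
        "unknown": cleaned.count("X"),               # X = Unknown
        "stop_codons": cleaned.count("*"),           # * = Stop Codon
        "possible_deletions": cleaned.count("/"),    # / = possible deletion
        "possible_insertions": cleaned.count("\\"),  # \ = possible insertion
    }


if __name__ == "__main__":
    # One line of a segment as it appears in the table.
    example = "QRCL\\NNLSSEEFNASSSLNSLPSTPTASRRNSTIVLRTDSEKR"
    print(summarize_segment(example))
```

Characters outside the legend (for example, stray marks left by scanning) are ignored by the residue filter rather than interpreted, so the counts reflect only symbols the legend defines.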








KPVNGRKTSLDVSNSAEPGFLAPGARSNIQVRSLPRPAKSSSMsH 
VTGGRGGPRP VSSS I D PSLLS T KQGGLTPS R LKE PT KVASGRTT 
PAPVNQTDREKEKAXAKAVALDSDNISLKSIGSPESTPKNQASH 

PTATKIAELPPTPLRATAKSFVKPPSLANLDKVNSNSLDLPSSS 
DTTQCI * | 


5695 


3 


1338 


GSKEPARSLHRRGSGHKSSAGKWGSVTLSTAGALG*KQLHQ*WT
QRCL\NNLSSEEFNASSSLNSLPSTPTASRRNSTIVLRTDSEKR 
SLAESGLSWFSESEEKAPKKLEYDSGSLKMEPGTSKWRRERPES
CDDSSKGGELKKPISLGHPGSLKKGKTPPVAVTSPITHTAQSAL 
KVAGKPEGKATDKGKLAVKNTGLQRSSSDAGRDRLSDAKKPPSG 
IARPSTSGSFGYKKPPPATGTATVMQTGGSATLSKIQKSSGIPV 
KPVNGRKTSLDVSNSAEPGFLAPGARSNIQYRSLPRPAKSSSMS
VTGGRGGPRPVSSSIDPSLLSTKQGGLTPSRLKEPTKVASGRTT
PAPVNQTDREKEKAKAKAVALDSDNTSLKSIGSPESTPKNQASH
PTATKLAELPPTPLRATAKSFVKPPSLANLDKVNSNSLDLPSSS
DTTQCI 3 


5696 


3 


1338 


GSKEPARSLHRRGSGHKSSAGKWGSVTLSTAGALG*KQLHQ*WT
QRCL\NNLSSEEFNASSSLNSLPSTPTASRRNSTIVLRTDSEKR 
SLAESGLSWFSESEEKAPKKLEYDSGSLKMEPGTSKWRRERPES 
CDDSSKGGELKKPISLGHPGSLKKGKTPPVAVTSPITHTAQSAL
KVAGKPEGKATDKGKLAVKNTGLQRSSSDAGRDRLSDAKKPPSG
IARPSTSGSFGYKKPPPATGTATVWQTGGSATLSKIQKSSGIPV
KPVNGRKTSLDVSNSAEPGFLAPGARSNIQYRSLPRPAKSSSMS
VTGGRGGPRPVSSSIDPSLLSTKQGGLTPSRLKEPTKVASGRTT
PAPVNQTDREKEKAKAKAVALDSDNISLKSIGSPESTPKNQASH

v± AiKXJVELPPTPLRATAKSFVKPPSLANLDKVNSNSLDLPSSS 
DTTQCI 


5697 


1147 


47 


PSEALS PPACPSAPAPRRS I ISRLFGTS PATEAAP PPPEP VPAA H 

QGPATVQSVEDFVPDDRLDRSFLEDTTPARDEKKVGAKAAQQDS 

DSDGEALGGNPMVAGFQDDVDLEDQPRGSPPLPAGPVPSQDITL 

SSEEEAEVAAP7KGPAPAPQQCSEPETKWSSIPASKPRRGTAPT 

x^j.«i^±'^w^LaL»viVKTOPEKRSSTRPPAEMEPGKGEQASSSESDP I 

EGPIAAQMLSFVMDDPDFESEGSDTQRRADDFPVRDDPSDVTDE 

DEGPAEPPPPPKLPLPAFRLKNDSDLFGLGLEEAGPKESSEEGK 

EGKTPSKENKKKKKKGKEEEEKAAKKKSKHKKSKDKEEGKEERR 

RRQQRPPRSRERTAA ~ j 


5698 
5699


2 


666 


GAEAAEPQEDLPPLSQSSRFFQEQQKMNKSliGPVSFKDVAVDFTH 

QEEWQQLDPEQKITYRDVMLENYSNLVSVGYHIIKPDVISKLEQ 

GEEPWIVEGEFLLQSYPDEVWQTDDLIERIQEEENKPS.RQTVFI 

ETLI*R/ERGNVPGNTFDVETNPVPSRKIAYTHSLCNSCER\GF 

NASSBYISSDGRYARMKADECSGCGKSLLHIKLBKTHPGDQAYE 




2 


1448 


rvrqppglwvrrwpamqcpaglsrvpgvag/dpslpsfrgprdH 

EAAHRGTIOTARHTRKLtYVOC5Pn.^r , DDT DDvcmvaTknc^nT > I 

RPS/GRTNAPFPQGQKPAGKAAPGPAAAGRVAMR\PGHPGLLAS 
DSQRSSSKGSGWETPVPWS * AQ PG WVSGLLLLGDPSGPGSL* RS 
TWLVGGARGPEGSGVRGSnwP<If3r'cnT^3MaT n/^TiTKmo rr nri »Tm 1 

WTQ KWTGE / S PAPGEEG\ VAPAPRGPTAEHGHCELTTESQ YSNN 
VP 1 LFQNPSGALRS RRTE PAG WVP PTRHE * DDG * TAAPAS GGAP 
VSTPTWAGTP/LNASLGPTDPQGKPGCRPPCALPKPAGPERSA* 
GGSLGCR/SMLPASSGPPPAPGPRRLAAGAUTSASARCPPAAAA 
GWQPRRPGFAGRAALPGPPHPPSS*RELGGLPGPGW*TLDPLPA 
HPAHPPGS AP PWGALGGWAAARASLPWS PS LCLS FPAVTPVAGL 
FPPGRG 


5700 


923 


597


NGHKGVWEINIY*RRSNIHKNSKSESHLNQDHSFPPPTPNSARS 
KLHSTGTAKNTGLPLSGAPRQRAVFSGRTICQEFSSCLQCAYLD 
E*CSIASSLIKAILRVSVLSE 


5701 


59 


410 

3 


I FEK I CS DTQE FISPEINPQI CS WL t FD KGAK/ NHATGKDSLFN I 
KWSWKNWLSTCR*MRPGPYFTPYTKINSK* IK/DANIRCETVKL 
L.EENTGENLHDTGLGNVFLDMTPKTQPTKQK | 


5702


3


1517



ETFVDPSQCGGIPSDSPHPVITPSRASESSASSDGPHPVITPSR 

ASESSASSDGPHPVITPSRASESSASSDGLHPVITPSRASESSA 
SSDGPHPV2TPSRA. < 5P < > < ?A5?<;nf?PH'DVT , rDQl>&CTrcoxo otv>t it 

PVI TPSRAS ESSAS SDGPHPV1TPSWSPGSDVTLLAEALVTVTN 
IEVINCS I TE I ETTTSS I PGASDTDLX PTEGVKASSTSDPPALP 
DS TEA KPH I TEVTAS AETLSTAGTTE SAAPHAT VGTP L PTNSAT 
EREVTAPGATTLSGALVTVSRNPLEETSALSVETPSyVKVSGAA 
PVS I EAGS AVGKTTSFAGSSASS YSPSEAALKNFTPS ETLTMDI 
TT KG P FPT S RD PL PS VP PTTTNS S RGTNS TLAK ITTS AKTTM KP 
PTATPTTARTRPTT\A*VQVKMEVSSSCG*VWLPRKTSLTPEWQ 
KG* CSSSTGNSTPTRLTSRSPYCVSGEANG/ PSAAARHVPlf AKR 

v»v*v-x- rvjr-rr x UV-3^V X VuNU iy K»v K.PL>TPDVATGPS 

LTS TGVY WGGASP VPRG VLGLT1*AH VLC FS KEKT 


5703


14 


1117 


HHKDSRSQGLPRTQECARPELRPLLCPRALWPVTRLS YRCPWQA " ' 
P KAG IGTKAKP S ESHLKLHPGWP S LDRQGE PATLGTGTGHCSDS 
R 1 1»R WHP * HTAAR* PRW R RLP S S HRWTRHLGVLRVQD KS * * VSL 
ur^KfKfUK * iUWKoVASSSNPPPGWSGPGASVFPARPVS 
ALPTG PRC W * APRGRTRQPCGW PRLS S PHATADWGPGCPLS PSR 
GSWETAPGS * WCPWI* * AARWTG WRTAS GAS AGLGRAADRP S AWA 
RRVAGLL PGQGLTVRR *H * TAGAP AS VRS S QGATRS PAPGGDQC 

ACGRGPGSC*HPPPWPVSPSSPVPCPSGR*HLRGPU J SAARPRA 
AG W PRHS PHDTQTPEP 


5704


23 


562 


GDYEFDSPYWDDISQAAKDLVTRLMEVEQDQRITAEEAISHEWI
SGNAASDKNIKDGVCAQIEKNFARAKWKKAVRVTTLMKRLRAPR
QSSTAAAQSASATDTATPGAAGGATAAAASGATSAPEGDAARAA

KSDNVAPRRP*LPPQPQMEVPPQPLMAVSPQPPMEASLQPLMGE 
SPQP 


5705 


23 


562 


GDYEFDSPYWDDISQAAKDLVTRLMEVEQDQRITAEEAISHEWI
SGNAASDKNIKDGVCAQIEKNFARAKWKKAVRVTTLMKRLRAPE
QSSTAAAQSASATDTATPGAAGGATAAAASGATSAPEGDAARAA
KSDNVAPRRP*LPPQPQMEVPPQPLMAVSPQPPMEASLQPLMGE
SPQP 


5706 


1161 


610 


QLGRFXAQDTVAI RKVKE VFGTGAMRHWI I*FTHKED*GGQALD 
DYVANTDNCSLKDLVRECERRYCAFNNWGSVBEQRQQQAELLAV 
I E R LGR EREGS FHSNDLFLDAQLLQRTGAGACQ ED YRQ YQAKVE 

WQVEKHKQELRENESNWAYKAI^RVKHLMLLHYEIFVFLIiLCSI 
LFFIIFLF 


5707 


28 


609 


GSPAPTPGFRRRPGRGTPSPGTRHHQGRAEPEPDAPERAPLRR* 
MFAI QPGLAEGGQ FIX3D PPPG LCQPELQ PDSKSNFMASAKDANE 
NWHGM PGRVE P I LRRS S S ES PSDNQAFQAPGS PE EG VRS P P EGA 
E I PGAE PEKMGG AGTVCS PLE DNG YAS SSLS I DS RS S S PE PACG 
TPRGPGPPD PLL PS VAQA 


5708 


44 


1925 


SFSWEETISPCFPKMPAEPWMLSPVSLGAAGWPGQPRPYLDlgPA 
Q AS VSRPHDRA* GEAVSLS I*S SGDVCGHTDGGGAGSDPQAKPKP 
PRCPFTAMPSPRTKQKVRNKVCLL IAIRYSD I PS DVSKAP \GPA 

GNPHDRSSTAA*LHRRAGAGSLCLSASLLPPSFSLGAPGAPSPL 
RVSPASGGPRKEGROGS(?f?*3vnnr:r'T>\ a dtij anr nnrrneir^^nn 

LLK* SDS P VKQLPA\SGQGSGAGMPP VGS SDILR PRPTSVSGTG 
RAAG*CSWQPAACCTPRSQ*WAVARSPSRCSRW*RQSGR*RG*S 
SRRRRGP*AAGRSTPAVP*PCS*GGAGRRAYACRTGWGYAPSR* 
LEPSGPTSGSAL* TWASHSTGA* *SRLCGTAGTGPLCSQSSRS * 
AG * RCCCTAAS PCGGSG PSHPGS PSAHCLSWSGGRTQ PRAPS AH 
GRGRAMGSRCVCTCTGLPCPGI PLSGAS PGGSGETGAGRSHTLK 
AARSRLSPRPGSGSRGSY*SHNDNWGTWPAPPSAGHLLVGG*NS 
QRTSSDH*YTGTRRPWAGPGTRCSTAPSRAAPPVSRCRPPPPPP 

PPRPPRLPAAAS/SGGASGSPA^SCSCSCRAPAKPASS/GEAPA 
PPPRPEPPPPPARRP 


5709


2 


2C31 


ITLCPLPQTEKCIJJIVWEAATPLGIYLKARVEAGGLKELEISWG 
LHQI WRWGAWMRAGMGGCRCWGVMAPFAPR/NALS FLVNDCS 
L I HNNV CMAAVF VDRAGEWKIiGGLD YM YS AQGNGGG P PRKG1 PB 








LEQYDPPEIiADSSGRWREKRSADMWRLGCLIWEVFNGPLPRAA 
ALRNPG KI PKTLVPH YCELVGAN P K VR PN P AR FLQNCRA PGGPM 
SNRFVETNLPLEEIOI KEPAEKOKPPnPT..Q V <!T.nflPDPni7r , ouv 
VL PQLLTAFE FGNAGAWLTPLPKVGKFLS AEE YQQK 1 1 PWVK 
MPSSTDRAMRIRLU3QMEQPIQYLDEPXVNTQIPPHWHGFLDT 
NPA I REQT VKSMLLLAPKLNEANLNVELMKHFARLQ AKDEQG P I 
RCNTTVCLGKIGSYIiSASTRHRVLTSAFSRATRDPFAPSRVAGV 

L/3FAATWNT»Y^MMnr*JA<"iV Tf «DVT.fV2T.TtJnDJ?lf GtrurviT* t?vt\ top 
iJlJX *■ **iv-u .* & I'lxv ij***f\\J! is, A, Lie VJJL.L7.Lj 1 VJJfJSfto VKJJQAf KAlRb, 

FL S KLE S VS ED PTQLE E VE KD VHAAS S PGMGGAAAS WAG WAVTG 

VSSLTS Kb I RSHPTTAPTETNIPQRPTPEGVPAPAPTPVPATPT 

TSGHWETQEEDKDTAEDSSTADRWDDEDWGSLEQEAESVIAQQD 

GAKIiPGATS * R YTAGQRV 


5710 


1 


562 


X PGSTI SCE VELMARMAKT I DSFTQNGTRLWI IDGLiDACEQDK 
VLQMLDTVRVLFSKGPFIAIFASDPHIIIKAINQNLNSVPSGFK 
\LNGHDYMRNIVHLPVFLNSRGL/RQ/LQENFS + LQCX5MBTFHA 
QIIiQG YRKKLTEEPHRTALGR*QNLVARQPS I DG* DAIGFELYV 
CIA I Q PNTNKDDAT 


5711 


1526 


1130 


RRHPFQWTTVTQEAFSHHDVAFTSTPVLFYPDSAQPFIVKSESS 
SQIAKAVLSQQRPSLFHECAFHF PS + SLQRHTI NLDQGI F* LLM 
LSEERQHLFESS/I WTTPHNLK* /PEIHEHJJGSHEGPIWTLFFIiIj 
QIL 


5712 


3 


1391 


GRKLFQSLDISERLKFLLTLDCVDDTLIVLAEEHGCLDIIKELP 
ETVIDIJ.NKCLTFHPSKRPTPDELMKDKVFSEVSPLYTPFTKPA 
S LFSS S LR CADLTL PED I SQLCKD INND YLAERS I EE VYYLWCL 
nuuuuAA&u vin wsi xrvoivi'i'JLL. JL JLiJrJNr Jjr EDGES FGQGRDRSS / 
TFR*YHWDIVVMPAKK*IERCWGRSILPITLKMTSI*ILPYSNSN 
NELSAAATLPLI I REKDTE YQLNR 1 1 LFDRLLKAYP YKKNQ I WK 
EARVDI PPI^RGLTWAALIiGVEGAIHAK YDAIDKDTP IPTDRQI 

SLCAPFL YLNFNNEALVYACMSAFI P KYDYNFFLKDNSHVI QE Y 
LTVFSQMIAFHDPEIiSNHLNE IGFI PDLYAI PWFLTMFTHVFPL 
HK I FH L W \ DTLLLGEFLFP 1 1» YWE 


5713


634 


284 


P VCAVPVDRWPVLPREDQEGQQL* AKLPRDFRR* FQ I LGPMEGH 
TACRCSRRGAQVQHLPREDIRAAE *DPHIjREVWPGLPTSSATS P 
* RAVLTSPCSHLGSADAASSHWIiCGVSFH 


5714 


212 


613 


WGIiGLGPTMSSLGGGSQDAGGSSSSSTNGSGGSGSSGPKAGAAD 
KS AWAAAAPAS VADDTP P P ERRNKS G 1 1 S E PLNKS LRRSRP L S 

HYSSFGSSGGSGGGSMMGGESADKATAAAAAASLLANGHDLAAA 
MA 


5715 


131 


1979 


ESASQQKRSKCLILTLKX.ELSGSAPKKTSARPGSSLWLPPHSQE 
QTPPASKLQGGGGGLQTGWGLHPVPVTAASPIjPRWCLFGAVAKV 
GLPGP*3jCPSGAA/ GGLQRGPGLS PLGAAGKVSCIiHPPSMVENN 
DSTCHEHHEGILAARVTPVP\SGKPGRVLKPPGRVCRPPHPAAS 
PRPPGS / SDLDGPRPO^IHLRAFPAAHGGPVNTPHGGEEKTFMSS 
QIRRKETKPL*RKTPAG\NNYQSNSIPVSQSPQLTVDLLPSAGR 
TQAPSGRGDAGKPTPGHG \LPKAS VILTPNCPCSLAGGQ* PPGL 
YPKTPKQRRWRRPL /LLGPSO*GSROSTC? + F\A rat/:pdud t nr> 

L*PDLSCILSNGSKHRREGL8FPRSLGPGRRGPAGLQSIiGCSPT 
PKNTACHS SGHVALQAGHDSARDVGSGHVALQAGHDS TQDVGRP 
VWRWIPIiE* LGLSRETGQATRRGLVWIS PGRAAAACVACAQALE 
EGPLRI,PGQDRGAQPCSHCPGRAAGQPEPGAGAPCRE/GG*DPT 
GLT/GVPGTDPKRGGRKPGQSGQETQGPTVWSGPESPLQPKP*E 
RQE / VGAGASSGVGLS RGRAGGPSSAWBVAAMLLLLRHGSHSEL 
TDLTEAQTSQH 


5716 


1711 


1370 


RVFSLLCEGPGHCYQGAVCREACAAASPGLDSAAEPHRLCEHTD ' 
*LPK*GPGYIQHFHCDSNILCILYNISFNLFSYSF*GVARYAC* 
RCPLVL* SGFFTI I VGG YS CCMPLKT 


5717 


44 


1489 


LPTEALRESEWVSE YGKCGPRGLVPEGESTS PLP£SVDTED£LD 
EGPGALVLESDLLLGQDLEFEEEEEEEEGDGNSDQLMGFERDSE 








GPSLGARPGLtPYGIiSDDESGGtjRATjQARQTgVPT?Dap/^Dr'-^7\T3r»P — 

RPGPACQLCGGPTGEGPCCGAGGPGGGPLLPPRLLYSCRLCTFV 
SHYS S HLKRHMQTHSGEKPFRCGR CP YA S AQL VNLTRHTRTHTG 
j EKPYRCPHCPPACSSLGNLRRHORTHAGPPTPor'DT*rvziri>r'r''pr> 

RPARPPSPTEQEGAVPRRPEDALLLPDLSLHVPPGGASFLPDCG 
Q \ CG VKGRAS AGLDQNHCQS / S L FPWT CRG CGQELE BGEGS RLG 
AAMCGRCMRGEAGGGASGGPQGPSDKGFACSLCPFATHYPNHLA 
RHMKTHSGEKPFRCARCPYASAHLDNLKRHQRVHTGBKPYKCPL 
f CPYACX5NIANLKRKGRIHSGDKPFRCSLCXJYSCNQSMWLIRHM 


5718 


120 


284 


VAHALSLPAESYGNDVSMTHPQLPPTQLAWDLCRTCLPLSYNFT 
S**STADPLHL 


5719 


48 


428 


ELNNGPFQMPIjCNGGNI^VTGSWADRSPI^KAASQGRLLALRTL 
1 LSOGYNVNAVTTinHUTDi.ucnnT /TMiirnniinni r t-it> .tihti. — _ 
i wwAiiv v a uun vir Lit! e>AtU7Un vALAK vL»L E AGANVNAI T 

IDGVTPLFNACSQGSPSCAELLLEYGAKAQP\ESCLPSP 


5720 


1 


1051 


f lqafrnasevpmvlvgtqdaisaa\nprvyrrtsrarklstdlk~~ 

» ftV * 1 \ i xci \ i v_lhjJ. XvjIjUMWSVSFQDVAQKVVAL\RKKQG;\IiAI 

gpck\slpn\spsh\savsaasiparapinqghs/sgggsafsd 
y\sssvpstpsisqrelrietiaasstptpirkqskrrsnifts 

RKGADP\DREKKAAGCKVDSIGSGRAIPIKQGILLKRSGKSIvNK 
EWKKKY VTLCDNGLLTYHPSLHD YMQtf IHGKEI DLLRTTVKVPG 
KRL PRATPATA PGTSPRANGIjS versntqlgggtgaphsas s as 
LHS E R P LS S S AWAGPRPEGLHQRS CSVS SADQWS EATTS L P PGM 
OHPASG 


5721 


97 


492 


k«£> t> ±* L»KKT BKSSNAa V a t / TTVQQFKRF I ENYRRH I G C VA 
VF YA rAGGL FIiE RAY Y YAFAAHHTG I TDTTRVG 1 1 LSRGTAAS 1 
SFM FS Y I LLTMCRNL J TFLRETFLNR YVPFDAAVD FHR L I AS TA 


5722 


88 


1043 


VALDVIAGSSPGGGMAGALLGPRVHGIRAVLRVARGGVQAPGAP
GSLGVSHAAAPPARPQGAAQSPHRGRRHGGGGAGLPPPRSPRFP 
QESVPASTSTARGPRRVSRRLPPQHPGPRGRRRRPGAGVGAPRR 
GRARGQAGLLGRQGQGGRGAERERAALQARRGRRPGPEPDQSCG
GRPRRAAAAPGRAPADPQPPAPRPAPAPDVRPPADAPAPAPAPA 
tr r if »r if nuioAu j aus t>EERQS QPRAETLRIjGRGAPLP \ PRAERGG 

RPKQAEQQQ\PKRPTPPARGPQSSGDPAMLPQRAGLRTGGLAGT 
KSSTREIPEMI 


5723 


88 


1043


VALDVIAGSSPGGGMAGALLGPRVHGIRAVLRVARGGVQAPGAP
GSLGVSHAAAPPARPQGAAQSPHRGRRHGGGGAGLPPPRSPRFP
QESVPASTSTARGPRRVSRRLPPQHPGPRGRRRRPGAGVGAPRR 
GRARGQAGLLGRQGQGGRGAERERAALQARRGRRPGPEPDQSCG 
GRPRRAAAAPGRAPADPQPPAPRPAPAPDVRPPADAPAPAPAPA 
PPPPPHLOALTAGSGEERQSQPRAETLRLGRGAPLPVPRAERGG 
RPKQAEQQQ\PKRPTPPARGPQSSGDPAMLPQRAGLRTGGLAGT 
KSSTREIPEMI


5724 
5725 


3 


1841 



FTNEAPPAPL,^DASASPLSPHRRAKSLDRRSTEPSVTPDLLNFK~ 
KGWLTKQYEDGQWK30IWFA1ADQSLRYYRDSVAEEAADLDGE ID 
LS AC YD VTE YP VQRNYGFQ I HTKEGE FTLS AMTSG I RRN W I QTI 
MKHVHPT TAPDVTS SLPEFKTJK «5 *3 r Q PPTT'IHj Tyrn xr/-M? t\ dt ^ t. 

DPEQKRSRARE\RRREGRSKTFDWAEFRPIQQALAQERVGGVGP 
ADTH\DPWRPEAEHGELERERARRREERRKRFGMLDATDGPGTE 
DAALRMEVDRS PGL PMS DLKTHNVHVE I EQR WHQ VETTPLREE K 
Q VPI A P VHLS SEDGGDRLSTHELTSLLE KELEQSQKEAS DLLBQ 
NRLLQDQLRVALGREQSAREGYVLQATCERGFAAMEETHQKKIE 
DLQRQHQRELEKLREEKDRLLAEETAATISAIEAMKNAHREEME 
RELEKSQRS Q I SS VNSDVEALRRQ YLEELQ S VQRELEVLS EQ YS 
2KCIjENAHLAQALEAERQ ALRQ CQRENQ E LKfAHNQBLNNRXiAAE 
tTRLRTLLTGDGGGEATGSPLAQGKDAYELEVPSGARPCLTQLC 
rQEPQGSAAWPLS Y R WGGTDLRQQESQG PGRS KS PE GGE EQ 




3 


1049


/WGHSEETSQSPNRTEP!-lDSDCSVDLGISiCST5DLSPQKSGPVG 
» WKSHS I TNMEIGGL KI YD I LS DN\ DLS SHLQPLK/ FTSAVDG 
CNIVRSKAATLLYDQPLQVFTGSSSSSDLISGTKAIFKFDSNHN 
>E/GAKYNKRPHKWAHNLHLKYMVLHS1ISNTVAV\RSQRHFVA 








LQTKS PNRPCOFS SS APS7 VDnP an J tmo qva rj'c tv'tomm p o kttjxt — 

NVRANTA YHLHQRLG PARHGEMW AI S PNDRLI PAVTRS TIQRQS 

SVSSTASVNLGDPGSTRRAQIPEGDYLSYREFHSAGRTPPMMPG 

SQRPLSARTYSIDGPNASRPQSARPSINBIPERTMSVSDFNYSR 
TSP 


5726 


2


486 


SRSbSMWWNSGLPASSHSSKLPVTVGFSGCVKRLRLHQRPLGAP
TRMAGVTPCILGPLEAGLFFPGSGGVITL/ESVGAGIPGPSRAG

GLIFHLGQARTPPYLQLQVTEKQVLLRADDG


5727 


21 


221 


RPILIUKETRRJbPWATGYAEVINAGKSTHNEDQASCEVLTVKKK 
AG AVTST PNRNS S KRRS S L ?NGE j 


5728 


2 


877 


GTR NGQFE PRRGRAWEGS AGGLRAPGAAAGGPGVQPRGSG / LPG 
NAIRAGVNPGRGPASPFWDLSLPWDLWPPPTDHAPGAPDFPAVE 
GR\PWAGGRPPWPVSGVLGSRVCGPLYSTSPAGPG/SGGLSPSQ 
GGPAGAGGDAG/I»PGRCPSAPWRAGSRPAAS CPDWIPGPQGLWL 
HRNPTS/GPPSQIGEGAEQGDEGVADAPQIQCKN/GAEDPPAED 
BPPQVPEAGEEDAVPAEEGPGGTPETQADQVRERPEAHIAEGGA 
KGSPRRLADPQDLPAGQMSLAPPFPPVAAVIRSNK 1 


5729 
5730


1


1525 


AGG ARE VLTLy LGHFAG F VGAH W WNQQDAALGRATDS I&2 P PGEI» 
CPDVI.YRTGRTLHGQETYTPRLII1MDLKGSLSSLKEEGGLYRDK 1 
QLDAAIAWQGKLTTHKEELYPKNPYLQDFLSAEGVLSSDGVWRV 
KS I PNGKGS S PLPTATTPKPLIPTEAS I RVWSDFI,RVHLHPRSI 
CMIQKYNHDGEAGRLEAFGQGESVIiKEPKYQBELEDRLHFYVEE 
CD YLQG FQI LCDIiHDG FSG VGAKAAE LLQDE YSGRG 1 I T WGLLP 
GPYHRGEAQRNIYRLLNTAFGIiVHLTAHSSLVCPIiSUSGSLGIiR 
PEPP VS FPYLH YDATLP FHCSAILATAliDTVTCS \ YRLCSS PVS 
MVHL\ADMLSFCGKKVVTAGAIIPFPLAPGQSLPDSLMQFGGAT 1 
PWTPLSACGEPSGTRCFAQSWLRGIDRACHTSQLTPGTPPPSA 
LHACTTGEBIIAQYLQQQQPGVMSSSHIiLLTPCRVAPPYPHLFS 
SCSPPGMVLDGSPKGAAVESVPVFG | 




1258 


1713 


KKFQ APARETC VE CQKTVY PME RLIjANQQVFH I S C FRCS YCNNK 1 

r/^5i^HorKX tCKJr'HFNQLrKS KGNYDEGFGHRPHKDLWAT 
KIETEGFWERPRNFENCGRPLKSPGGEDCPSC*GGCPGSNY*AQ 
GSSSREKGGQASWN PKLRVA j 


5731 


122 


443 


RSHRGELIPKDSCYMRKPPRRPkKRRQG/CALPQGCLTFKDVAI | 

e« 0 uci&nf\iiwiii'«riiUKftL X ftAVnUr.M X KNLinoVGiiTSKDSW YMRK 1 

KPGRGRGKQRRQEWFFLRVY 1 


5732 


226 


772 


PPSRSCQSPKRKSRRRAHVTVTIjVCGFrSFSFSLPLYIjCGCLRF 
PERTCS QIiQQADWAPDFGPS S F VPS WGATATGARKFL I AFN I \N 
LLGTKEQAHRIALNIiREQGRGKDQPGRLKKVQGIGWYI/DEKNLA 
QVS TNLLD FEVTALHT VYE E TCREAQELS LP WGS QLVGLVPLK 
ALLDAA 


5733 


1 


460 


PALQE VNA5IALAWGKQYENDARTLFEFTSGVNDTES P 1 1 YRDES| 
« u\int,urwuv. ^*-»v7Jwoij*iJji\.^.xrr lsKL/f PIKr KJjCavJr EAI KSAYM { 
AQVQ YS MW VTRKNAW Y FAN YD PRMKREGLH YWI ERDE KYM \ AS 
FDE I \ VP \ E F IGKMDE VLS RD PM 


5734 


3 


968 


RCNS PESI>TSLLVLLTTANKLFVL1 PAYSKNRAYAI F? 1 VFTVl! " 
GSLFLMNLLTAI I YSQFRG YLMKSLQTSLFRRRLGTRAAFE VLS 
SMVGEGGAFPQAVGVKPQNLLQVLQKVQLDSSHKQAMMEKVRSY 
GS VLLS AE EFQK L FNE LDRS WKE HP PRPE YQS P FLQS AQFL FG 
HYYFDYLGNLIAIiANIiVS I CVFLVLDADVLPAERDDFI LGILNC 
V FI VY YLLEMLLKVFALGLRG YLS YP S JTVFDGIjIjTVVLIjVL E I S 
TL\VCTDCHTQAGGRRWW/RLLSLWDMTRMLNMIiI VFRFLRI IP 
SMK PMA WAS TVLGL j 


5735 
5736


2 
1 


<J40 
3 82 


FFTPCVARAFNFPDQATVKKAAYSLPRVGGGTSCGLPQARRISL | 
ATPRQLYK/ SSNMTQRWQRREI SNFE YLMFLNTIAGRTYNDLNQ 
YPVFPWVLTNYESEELDLTLPGNFRDLSKPIGAIiNPKRAVFYAE 
RYETWBDDQS PP YH YNTHYS TATSTLS WLVRI VS I F IELACLWY 
LKILT j 

GTRPSTKKSGYSPO^VAVIHCKGHQKENTAVAHSNQKADSAAQV 1 








TARLS VTPPNLL PT VS FPQ PDL PDNP V YSTTTE KLAS D LRANKN 
QES * * I LPDSGI FI P * T*TSYLQSTTHLRRAKLPQIiLRR 


5737 


290 


1041 


KACLHLLSSPLTSNFLFNPLLPDSLYSVEARSQRANLGPCRRKR 
LQTLMRLAAG FQ YS S H KDPS LS AKEKKTD YHNEARG P WPGWVG * 
RTADGS CGRG PDGAHH PG PKS S S WRAS RIiLPGLGGS HHLDAYVG 
RDLECGT P A PLQLB I P P QPRGHPAP I PTGQAG PRDS G PG AS P * V 
ETRPLTDGRR * PGVRPVGWTPAHPAGTLRPRGAVE PSVSACGKW 
APSPTSQGCCEGRCDAVPKHRAWRTPLCSQ 


5738 


8 


4^0 


DTLS LNCT LPETLPMTPS F* LS FL * FPGLARAKS I PTKT YSNE V 
VT L WYR P P D I L LGS TDYS TQ I DMW * GOVE VWQG PCG KGGGL VTT 

ATQPAAFLFTVPSLPRGVGCIFYEMATGRPIiFPGSTVEEQLHFI 
FR I ZjS EE A WALCAVE THR 


5739 


1 


1222 


SFQRRGIRWNVHTLHPHPRAVWAGIGRGHGS*AttGRARAPALC 
FPTLLEFLES LEPDLPALRAMGLHLWAAGPGTHPAGI SDIiLABV 
SAEVDGPVPGYIiSSPOSTTDTfT.VTRTQPTTv^T dvbed t cttt 

LQCQGFYQLCGVHQEDVIYLALPLYHMSGSLLGIVGCMGIGATV 
VLKSKFSAGQFWBDCQQHRVTVFQYIGELCRYLVNQPPSKAERG 
HK VR LAVGSGLRP DT WER F VRR FG PLQVL ET YGLTEGNVATIN Y 
TGQRGAVGRAS WL YKH I FP FS L 1 RYD VTTGE P I RDPQGHCMATS 
PGEPGLLVAPVSOOSPFLiGYTVnnPP'T.nnmrT t vnupDnnnTrDDM 
TRD L LVCDDQ G FLRFHDRTGD P FRW KGENVATTE VAE V FE ALD F 
LQEVNVYGVTV 


5740 


265 


231 


PAYWiKVPTLCIiESKTDl.R RKJi QHvc hot 2\r* cudpt */-»*t t ,,wu.-» — 

YVYERVYN*NISRMVHALEQKIUIPAGI^SSMALQLNPCLGMI>IA 
LQSELHKLYDEETQSWVSGSACGGYP 


5741 


1 


650 


PRKTMRRGVLMTLLQQSAMTIiPLWIGKPGDRPPPLCGAIPASGD 
YVAR PGDKVAARVKAVDRnprno T T » a in/vc v c ixa *txtv-vdx it> tv i- nn 

EGKERHTLS RRRVI PLPQWKANPBTDPEALFQKEQLVIAL YPQT 
TCFYRALIHAPPQRPQDDYSVLFEDTSYADGYSPPLNVAQRYW 
ACKE P KKK* CRliADS PS PNDTGQDS RGRAG I KH I PPLKKK 


5742 


2 


362 


TQSVKEILKRNP^^VNLTDKDGNTALMIASKEGHTEIVQDLLDAG 
TYVNI PDRSGDT VLIGAVRGGHVE I VRAIiLQKYADIDIRGQDNK 
TAL YWAVEKGNATM VRDI LQCNPDTEI CTKDG 


5743 


2 


415 


GKTPEGIDAIEEIEIDLEETEREISPQENGLEEVKPLGEMQTDL 

KATGREISPREKTPEVIDATEEIDKDLEETGRREISPEENGPEE 

VKPVDEMETDLKTTGREGSSREKTREVIDAAEVIETDLEETERE 
ISPQE 


5744 


3 


703 


TRRTTTTSPTTTRQMTTTPAAIjPTTVVTTPDLTTGTPLQMrTIA 
VFTTANTCLiSIjTPS TLPEEATGXiliTPE PS KEGP I LTAESETVLP 
S DSWS S AESTS ADT VLLTS KES KVWDLPS TSH V«5 MWKTQncwcc 
PQPGASDTAVPEQNKTTKTGQMDGI PMS MKNEMP I SQLLM I IAP 

SLGFVLFALFVAFLLRGKLMETYCSQKHTRIiDYIGDSKNVLNDV 
OHGREDEDGLFTL j 


5745 


1400 


599 


UKSRFVNIjMKHSKKT YDS FQDELEDY I KVQKARGIiEPKTCFRKM 
KGDYLETCGYKGEVNSRPTYRMFDQRLPSETIQTYPRSCNIPQT 
VENRLPQWLPAHDSRLRIiDSLSYCQFTRDCFSEKPVPLNFNQQE 
YrCGSHGVEHRVYKHFSSDNSTSTHOASHKOIHOKRKHHDWP^n 
EKSEEERS KHKRKKS CEE IDIiDKHKSl QRKKTEVE I ETVHVSTE 
KLFQNIRKEKKSRDVVSKiCEERKRTKKKKEQGQERTEEEMLWDQS I 

trGF 


5746 
5747 


3 


821 


S FASGRLTPS S PA FDGELDIiQR YSNGPA VS AWS LGMGAVS WS E S 
RAGERRFPCPVCGKRFRFNSILALHLRTHQPERPRSPAARLLLE 
LEERALLREARLGRARSSGGMQATPATEGloARPOAPSSSAFRCP 
YCKGKFRTS AERERHLH I LHR PW KCGLCS FGSSQE E ELLHHS LT 
AHGAPERPLAATSAAPPPQPQPQPPPQPEPRSVPQPEPEPQPER 

EATPTPAPAAPEEPPAPPEFRCQVCGQSFTQSWFLKGHMRKHKA 
SFDHACPV 




2 


1328


3RHVETTiC I HFLG PS TGS TAKTGGRNWLKTGNCL YGNTCRFVHG ~ 

?SPRGKGYSSNYRRSPERPXGDLRER1KNKRQDVDTEPQKRNTE 

5SSSPVRKESSRGRHREKEDIKITKERTPESEEENVEWETNRDD 








SDNGDItfYDYVHELSIiEMKRQKIQRELMKLEQENMEKREEIIIK 
KEVS PEWRS KLSPSPS LRKSSKS PKRKS S PKSS SAS K VHP PCTci 

AVSSPLLDQQRNSKTNQSKKKGPRTPSPPPPIPEDIAI^KKYKE 
KYKVKDRIEEKTRDGKDRGRDFERQRBKRDKPRSTSPAGQHHSP 
ISSRHHSSSSQSGSSIQRHSPSPRRKRTPSPSYQRTLTPPLRRS 
ASPYPSHSLSSPQRKQSPPRHRSPMREKGRHDHERTSQSHDRRH 
ERREDTRGKRDREKDSREEREYEODO^^KnHRnryRTTDDr^onD 
RE 


5748 


934 


473 


SEGPQVFYKGLAPTLIAIFPYAGLQPSCYSSLKHLYKWAIPAEG 
KKNENLQNLLCGSGAGVISKTLTYPLDLFKKRLQVGGFEHARAA 
FGQVRRYKGLMDCAKQVLQKEGALGFFKGLSPSLLKAALSTGFM 
FFSYEFFCNVFHCMNRTASQR 


5749 


552 


1 


GFPVDPRVRGSTLSLAERPKGMIRSGSFRDPTDDVHGSVLSLAS ~ 

L i JJrtiiCKi'jyoiiyiKRljKJ^bljboyiSAVAl Li I oQLiSANAN 

LVAAFEQSLVNMTSRLRHLAETAEEKDTELLDLRETIDFLKKKN 
SEAQAVIQGALNASETTPKELRI KRQNSSDS ISSLNS I TSHSS I 
GSSKDADA 


5750 


22 


866 


IF I S I CIjWKAHLCFLLI*P KDCI DQ VMKLQNLFVDDS GR YLA IQ F 
IHjEWAYVFLYYYEYRKAKDQLDIAKDISQLQIDLTGALGKRTRF 
V"*' * vnyuiuuviuuwjuv ueNLCtM irAJrTt'yBnLTKNLELNDDT 
ILNDI KLADCEQFQMPDLCAEEI Al I LGI CTNFQ KNNP VHTLTE 
VELIAFTSCLLSQPKFWAIQTSALILRTKLEKGSTRRVERAMRQ 
TQ ALADQFED KTTS VLERLKI F YC CQVP PHWAI QRQLAS LLFEL 
GCTS S ALOI FE E MWE 


5751 


3 


751 


SCGS ALRAWRCGAAALiAT FPAPALPGLMYRAL YAFRS AEPNALA 

FAAGETFLVLERSSAHWWIiAARARSGETGYVPPAYTjRRIiQGLEQ 

DVLQAIDRAIEAVHNTAMRDGGECYSLEQRGVLQK1» IHHRKETIiS 
RRG PS ASS VAVMT <;ct cdupt .naanawn DMrarro a r» cfnnn t- n 

SSEHLGADGGLFQIPLPSSQIPPQPRRAAPTTPPPPVKRRDREA 
LMASGSGGHNTMPSGGNSVSSGSSVSSCI 


5752 


3 


471 


GPVCGVGLiSVAWAC3PWRrtPVHQV/f3fir:f2WAZi.T uraPT err cr>h >Trf; — 
VEREMELRHKNEMLRVETEARARAKAERENADI IREQIRLKASE 
HRQTVLES I RTAGTLFGEGFRAFVTDRDKVTATVN I FIKQGWQV 
AERQHVGAS WS PRSCPCRIjCTALi 


5753 


34 


483 


DDSXAI PGGVQAP FGAVRNI YTPRTGHRIRKLDQ I QSGGN YVAG 
GQEAFKKLNYLDIGEIKKRPMEVVNTEVKPVIHSRINVSARFRK 
PLQEPCTIFLIANGDLINPASRLI.IPRKTLNQWDHVLC2MVTEKI 
TLRSGAVHRLYTLEGRLV 


5754 


14 


331 


TLVHVVEFAGEHAEAIASREQEVX»QGWKELI^ACEDAR1jHVSST ' 
ADALRFHS Q VRDLLS WMLX5 IAS Q I GAADKPRCPS S LLGLPAS P W 
WPTPATPSPLTAPFSME 


5755 


3 


888 


LGDQF YKEA IEHCRS YNS RLCAE RS VRL P FLDSQTG VAQNNC Y I 
WMEKRHRGPGLAPGQLYTYPARCWRKKRRLHPPEDPKLRLLE I K 
PBVEljPLKKDGFTSESTTLEALLRGEGVEKKVDAREEESIQEIQ 
RVLENDENVEEGNEEEDLEEDI PKRKNRTRGRARGSAGGRRRHD 
AASQEDHDKPYVCDICGKRYKNRPGIjSYHYAHTHLASEEGDEAQ 

dqbtrsppnhrnenhrpqkgpdgtvipnnycdfclggskmnkks 
grpeelvs cadcgrsahlggegrkekeaaa 


5756 


3 


621 


SS KLQAIiFAHPLYNVPEEPPLIiGAEDSLLASQEALR Y YRRKVAR 
WNRRHKMYREQMNLTSLDPPLOLRIaEASWVQFHLGINRHGLYSR 
SSPWSKLLQDMRHFPTISADYSQDEKALLGACDCTQIVKPSGV 
HLiOJVLRFSDFGKAMFKPMRQQRDEETPVDFFYFIDFQRHNAEI 
AAFHLDR I LDFRRVPPTVGR I VNVTKEI L 


5757 


3 


473 


YKDALLLPDNHRQWFENGTI.KLTDVQKGMDEGEYLCSV1.IQPQ 
LS ISQSVHVAVKVPPL IQPFEFPPAS IGQLLYI PCWSSGDMPI 
RITWRKDGQVI ISGSGVTIESKEFMSSLQ I S SVSLKHNGN YTCI 
ASNAAATVSRERQLIVRVPPRFW 


5758 


1 


474 


FRRGAGAERGEHREGERGAAGMGEFKVHRVRFFNYVPSGIRCVA 
YNNQSNRLAVS RTDGTVE I YNLSANYFQEKFFPGHESRATEALC 
WAEGQRLFSAGLNGB IMEYDLQALNI KYAMDAFGGPI WSMAAS P 
5759



5760



1240 



1221 



SGSQLLVGCEDGSVKLFQITPDKIPV 



gnaafagqgvvyetfhmsplpsyttngt vhVvvnnqigfttdpr 

MARS S P YPTDVAR WNAP IFHVNADDPEAVI YVCS VAAE WRNTF 
NKDVGADL VC YRRRGHNEMDE PMFTQPLM Y KQ I HRQVP VLKK YA 
DKLIAEGTVTLQEFEEEIAKYDRICEEAYGRSKDKKILHIKHWL 
DSPWPGFFNVDGEPKSMTCPATGIPBDMIiTHlGSVASSVPbEDF 
KIHTGLSRII^GRADMTKNRTVDWALAEYMAFGSLLKEGIHVRL 
NGQDVEKGTFSHRHHVLHDQEVDRRTCVPMNHLWPDQAPYTVCN 
S S LS E YG VLG FELG YAMAS PNAL VLWEAQ FGD PHNTAQ CXI DQF 

ISTGQAKWVRHNGIVLLLPHGMEGMGPEHSSARPERFLQMSNDD 
SDAYPAFTKPF5VSQL 

VRDITSDSLSLSWWPEGQFDKFLVQFKNGDGQPKAVRVPGHED - 
GVTISGLEPDHKYKMNLYGFHGGQRVGPVSAVGLTAPGKDEEMA 
PAS TE PPTPE P ? I KPRtiEE LTVTDATP DSLS LS WTVP EGQ FDH F 
LVQYKNGDGQPKATRVPGHEDRVTISGLEPDNKYKMNLYGFHGG 
CRVGPVSAIGVTAAEEETPTPTBPSMEAPEPPEEPLLGELTVTG 
SS PDSLS LS WTVPQGRFDS FTVQYKDRDGRPQ WRVGGEES EVT 
VGGLE PGRKYKMHLYGLHEGRRVGP VS T VG VTAPQED VDE T PS P 
TEPGTEAPEPPEE PLLGELTVTGSS PDSLS LS WTVPQGRFDS FT 

VQYICDRDGRPQAVRVGGQESKVTVRGLEPGRKyKMHLYGIiHEGR 
RLGPVSAIGVT 



5762



344 



5763



429 



SCDMAEAAALVWIR3PGFGCKAVRCASGRCTVRDFIHRHCQDQN 
VPVENFFVKCNGALINTSDTVQHGAVYSLEPRLCGGKGGFGSML 
RALGAQ IEKTTNREACRDLSGRRLRDVNHEKAMAEWVKQQAERE 
AEKEQKRX^RLQRKLVEPKHCFTSPDYQQQCHEMAERLEDSVLK 
GMQAAS S KM VSAE I SENR KRQ WPTKS QTDRGAS AGKRRCFWLGM 
EGLETAEGSNSESSDDDSBEAPSTSGMGFHAPKIGSNGVEMAAK 
FPSGSQRARWNTDHGSPEQIiQIPVTDSGRHILEDSCAELGESK 
EHMESRMVTETEETQEKKAESKEPIBEEPTGAGLWKDKETEERT 
DGERVAEVAPEERENVAVAKLQESQPGNAVIDKETIDLLAFTSV 
AELELLGLEKLKCELMALGLK CGGTLQ 

GSTGQTPLH^gUGGGGSGGGRR^TPR GMPKEKYEPPDPRRMYTI 
MS S EE AANGKKSHWAE LE I SG KVRSLS ASLWSLTHLTALH LS DN 
SLSRIPSDIAKLHNLVYLDLSSNKIR 



LD KDTG L IML I ARLD YEL I QR FTLT I IARDGGGEE TTGR VR I N V 

LDVNDNVPTFQKDAYVGALRENE PSVTQL VRLRATDEDS PPNNQ 

ITYSIVSASAFGSYFDISLYEGYGVISVSRPLDYEQISNGLIYL 
TVMAMDAGN" 



441 



5765 



vcakacgemrqllrpidrqr ydenedlspveeivsvrgfsleek; ^ 

LRSQL YQGD FVHAMB GKDFN YE YVQREALR Vp LI FRE KDGLG I K 

MPDPDFTVRDVKLLVGSRRLVDVMDVNTQKGTEMSMSQFVRYYE 
TPEAQRDKL 



825 



5766



1608 



QKILRLNNSHQPPTSSSNSKDCGGPASSGAGATAALADGLKFAS 
VQASAPQGNSHKETSKS KVKRS KTSKDANKSLPSAALYGI PEIS 
STG KRQE VQGR PGEATGMNSALGQS VS SGGSGNPNSNSTS TSTS 
AATAGAGSCGKSKEEKPGKSQSSRGAKRDKDAGKSRKDKHDLLQ 
GHQNGSGSQAPSGGHLYGFGAKSNGGGASPFHCGGTGSGSVAAA 

GEVSKSAPOSGLMGNSMLVKKEEEEEESHRRIKKLKTEKVDPLF 
TVPAPPPHV 



663 



5767



892 



SGLFSVDP AS SQAMBLSDVTL I EGVGN EVM WAG WVL I LALVL 
AWLSTYVADSGSNQLLGAIVSAGDTSVLHLGHVDHLVAGQGNPE 
PTELPHPSEGNDEKAEEAGEGRGDSTGBAGAGGGVEPSLEHLLD 
IQGLPKRQAGAGSSSPEAPLRSEDSTCLPPSPGLITVRLKFLND 
TEELAVARPEDTVGALKSKYFPGQESQMiaiYQGRLLQDPARTL 
RSLNITDNCVIHCHRSPPGSAVPGPSASLAPSATEPPSLGVNVG 
SLM VP VFWLLG VVWYFR INYRQ F FTAPATVS LVGVTVFFS FL V 
FGMYGR 

NFRATPRPPTRPELRTGTEVILWYLD WRALMKRKRMKANIKLVG" 
SG FPL PSS D LDD S LTEE I DE K1G FRNDANFDWQNVADFRDAGGS 
LTEVKVEEEERDPQSPEFEIEEEEEMLSSVIPDSRRENELPDFP 








HIDEFFTi.bJSTPSRSAVDEPHLLVNIEKQKIiBLEKRRI 1 ,DIEAER 
LQVEKBRLQI E KERLRHLDMEHERLQLBKERLQIERE KLRLQI V 
NSEKPSLBNELGQGEKSMLQPQDIBTEKIiKLERERJ^QLEKDRLQ 
FLKFSSEKLQlEKERLQVEKDRLRIQKEGHIiQ 


5768 


3 


476 


SSRSRLSVSVSPPPPGIVELGPPFAWEFCSRIjGSAVTSQRAGPA " 
AAMVAKDYPFYLTVKRANCSLELPPASGPAKDAEEPSNiCRVKPL 
SRVTSLANLIPPVKATPLKRFSQTLQRSISFRSESRPDILAPRP 
WSRNAAPSS TKR RDS KLWSETFDVC 


5769 


38 


£67 


TKTKKGVKEKATDQSVKAFAEHCPELQYVGFMGCSVTSKGVIHIi 
TKLRNLSSLDIiRHITELDNETAMEIVKRCKNLISIiNLCLNWTIN 
DRCVE VXAKEGQNLKEIiYLVSCKITDYALI At GR YSMTI ETVDV 
GWCKEITDQGATIjIAQSSKSLRYLGLWRCDKVNEVTVEQLVQQY 
PHITFSTVLQDCKRTLERAYQMGWTPNMSAASS 


5770 


1 


484 


DSRRYDVKTRKWSFLLEEHSKLIAKVRCLPQVQLDPLPTTLTIiA 
FASQLKKTSLSLTPDVPEADIiSEVDPKIiVSNLMPFQRAGVNFAI 
AKGGRLLLADDMGIiGKTIQAICIAAFYRKEWPLLVWPSSVRFT 
WEQAFLRWLP SLS PDC INVWTGKDRLTA 


5771 


168 


741 


GLLPSACLRARSWREASEGPSSRACSNGSQDTFEACYSGTSTPS 
FHGSHCSGS DHSS I/3LEQLQD YM VTLRSKLG PLE I QQ PAMLLRE 
YRLGL P IQD YCTGLLKLYGDRRKFLLLGMRP FI PDQD IGYFEGF 
LEGVG IREGG I LTDSFGRI KRSMSSTSASAVRS YDGAAQRPEAQ 
AFHRLLADITHDIE 


5772 


148 


383 


EFNLALVSPSHPQIKAEDDQPLPGVLLSIiSGGLFRSNLLTQDNG 
ILTFSNLVTCSAI YHLPVFPERE PGCSMRDLRVA 


5773 


2 


723 


PRVRSKHNFCFMEMNTRLQVEHPVTEMI'IXSTDLVEWQLRIAAGE ' 

K I PLSQEE I TljQGHAFE AR I Y AE DP SNNFMP VAGPLVHLS T P RA 

DPSTRIETTT^QGDEVSVHYDPMIAKLVVWAADRQAALTKLRYS 

IJ?QYWXVGLHTNIDFLLNLSGHPEFEAGNVHTDFIP<3HHKQI*LL 

SRKAAAKESLCQAALGLILKEKAMTDTFTIjQAHDQFSPFSSSSG 

RRLNISYTRNMTLKDGKNSK 


5774 
5775 


5 


592 


FVEEENIRWRCGGSELNFRRAVFSADSKYIFCVSGDFVKVYST 
VTEECVHILHGKRNLVTGIQLNPNWHIjQLYSCSLDGTIKLWDYI 

dgi l i ktfi vg c klhaxi ftiaqaeds vfvtvnke kpd i fql vs v 
klpksssqevbakelsfvldyinqspkciafgnegvyvaavref 
ylsvyffkkettsrvtlsss 




3 


538 


ssgccdpaapsslaeaatmpvskcpkksbslwkgwdrkaqrngl"" 
rsqvyavngdyyvgewkdnvkhgkgtqvwkkkgaiyegdwkfgk 
rdgygtlslpdqqtgkcrrvysgwwkgdkksgygiqffgpkeyy 
egdwcgsqrsgwgrmyysngdiyegqwendkpngegmlrlsqnp 

RP 


5776 


2 


484 


ri^dcvcqnlsesl^tlcpskgi^fvppdidrrtvelri^gnf 

1 1 HIS RQD FANMTGLVDLTLSRNTI SHI Q P FSFTjDLES IjRSIjHIj 
DSNRLPSLGEDTLRGLVNLQHLIVNKNQLGGIADEAFEDFLLTL 
EDLDLS YNNIiHG P AVGLRGDAW VQPS TS 


5777 


2 


949 


GQDPEPGQDJjFQPEREVDPSWGRGREPRLGKLRFQNDHLSVI#KQ 

vkkleqalkdgsagldpqlpgtcysphcppdkaeagstlpenlg 

GGSGSEVSQRVHPSDLEGRBPTPELVBDRKGSCRRPWDRSLENV 
YRGSEGSPTKPFINPI*PKPRRTFKHAGEGDKDGKPGIGFRKBKR 

nlpplpslpppplpsspppssvnrrlwtgrqkssadhrksyefe 
dllqsssessrvdwyaqtklgltrtlseenvyedildppmkenp 

YEDIELHGRCIjGKKCVIiNFPAS PTQQTDTYtt.tvoct c v-oa uonrt 
NSERRNV 


5778 


1 


1210 



qrrqsvsrlllpvflleppaepglepppeeeggepagvaeepgs 
ggpcwlqiieevpgpgplggggplrspssyssdelspgepltspp 
wapi^aperpehllnrvlerlaggatrdsaasdillddivlths 
lflptekflqeijiqyfvraggmegpegi-grkqaclamllhfldt 

Y0X3L1^EEEGAGHI1KDIjYLLIMKDESLYQGLREDTLRI,HQLVE 
rVELKIPEENQPPSKQVKPriFRHFRRIDSCLQTRVAFRGSDEIF 
^RVYMPDHSYVTIRSRIiSASVQDILGSVTEKLQYSEEPAGREDS 
kILVAVS SSGEKVLLQ P TBDC VFTALG INSHLFACTRDS YEALV 








PLPEEIQVSPGDTEIHRVEPEDVANHLTAPHWELFRCVH3LEPV 
DYVPHGE 


5779 


138 


1571 


EAVQVLIKHSADVNARDKNWQTPLHVAAANKAVKCAEVIIPLLS 
S VNVSD RGGRTA1jHHAAIjNGHVEI»IVNLLI»AKGAN inafd kkdrr 
ALHWAAYMGHLDWALLINHGAEVTCKDKKGYTPLHAAASNGQI 
NWKHLLNLG VE I DEI NVYGNTALHI ACYNGQDAWNEL ID YGA 

nvnqpnnngftplhpaaasthgalclellvnngadvniqskdgk 

S PLHMTAVHGR FTRS QTL I QNGGE I DCVD KJOGNTPLH VAARYGH 
EIXINTLITSGADTAJCCGIHSMFPLHIiAALNAHSDCCRKLLSSG 
QKYS I V3 L FSNEHVLS AG FE I DT PDK FGRTCLHAAAAGGNVEC I 
KLLQSSGADFHKKDKCGRTPIiHYAAANCHFHClETLVTTGANVN 
ETDDWGRTALHYAAASDMDRNKTILGNAHDNSEELERARELKEK 
EATLCLEFLLQNDANPS IRDKEGYNS I HYAAAYGHRQCLE LLLE 
RTNSG F3BS DSGATKS PLHLAVSEMP 


5780 


154 


624 


QFFRVITCLPFKGPDYRLYKSEPELTTVAEVDESNGEEXSEPVS 
EIETSWKGSHFPVGWPPRAKSPTPESSTIASYVTLRKTKKMM 
DLRTER PRSAVEQLCIAE5TRPRMTVE EQMER IRRHQQACLREK 
KKQliNVIGASDQS PLQS PSNLRDNP 


5781 


19 


941 


RGSLGGHPWRP^MRAASQGCIjPVSFVTUPHQERAYGGRGPGGAF" 

PAPPVSGTCPPDZ»IYAPTPEKAEGGSQKNHQPPPGERAAHRDGE 

QAPCRAGPTRKVAVAPRPPSCP*GPE\PGEEPRRPLDRSPPLGQ 

VaPHFTSQDAKSAEDEAPSRHLGKHQPRSAQVGSRLDALQGPKT 

QHSIHTVTCKSPRQKEDRSPKPPQAPKHPEEHGRQS\QAPPPLP 

VAPSRTCGGC*TWDPALLVSP/PQGDSTPELPAP\QQPTGGPSR 

CRQALPPQG*RQQPRQRPR/PTGASRSHPAKAKGCQGPPKIRNY 
NIMD 


5782 


5176 


1237 


DRSMMSMAADSYTDSYTDTYTEAYMVPPLPPEEPPTMPPLPPEE 
PPMTPPLPPEEPPEGPAliPTEQSAIiTAENTWPTEVPSLPSEESV 
SQPEPPVSQSEISEPSAVPTDYSVSASDPSVLVSEAAVTVPEPP 
PEPESSITLTPVESAWAEEHEWPERPVTCKVSETPAMSAEPT 
VLASB P PVMSETAET FDSMRAS GHVASE VSTSIiLVPAVTT P VLA 
ESI LE P PAMAAPESSAMAVLE 3 SAVT VLESST VTVLE SSTVT VL 
BPS WTVPEPPWAEPDYVTI PVPWSALEPSVPVLEPAVSVLQ 

psmivsepsvsvqestvtvsepavtvseqtqviptevaiestpm 
i less ims shvm kgi nls sgdqnlape i gmqe i alhs gee phae 

bhlkgdfyesehgiwidlninwhliakemehntvcaagtspvge 
igeekilptsetkqrtvldtypgvseadagetlsstgpfalepd 
atg\tskgi efttastlslvnkydvdls lttqdtehdmlists p 
sggseadiegplpakdihldlpsninlvssdtneplpvkrd\dq 
tlaali\sl:<essggekevppps*rehlpdsgfsaniedinead 
lvrpvssprtwnvlpspragii\egp\llasdfgpvqnlysspw 
\ssmp\erasgs\ssgekgg\yeifvkvkdthekskxnknrdkg 

E KE KKRD S S LRSRS KRS KSS EHKS RKLTS E5RSRARKRS SKS KS 
H RS \Q1'RSRSRS /RDRRRRSS RS RSKSRGRRS VS KE KRKRS PKH 
RSKSRERKRKRSSSRDNRKTVRARSRTPSRRSRSHTPSRRRRSR 
S VGRRRS FS I S PSRRS RTPSRR S RTPSRR SRTPSRRSRTPSRRS 
RT PS RRSRTPSRRRRS RS WRRRS FS I S P VRLRRSRTPLRRR FS 
RSPIRRKRSRSSERGRSPKRLTDLDKAQLLEIAKANAAAMCAKA 
GVPLPPNLKPAPPPTIEEKVAKKSGGATIEELTEKCKQIAQSKE 
DDDVIVNKPHVSDEBEEEPPFYHHPFKLSEPXPIFFNLNIAAAK 
PTPPKSQVTLTKEFPVSSGSQHRKKEADSVYGEWVPVEKNGEEN 
KDDDNVFS SNLPS EP VDI S TAMS ERAJjAQ KRLS ENAFDLEAMS M 
LNRAQE R I DAWAQLNS I PGQFTGS TGVQVLTQEQLANTG AQAW I 
KKDQFLRAAPVTGGMGAVLMRKNGWREGEGLGXNKEGNKEPILV 
DFKTDRKGLVAVGERAQKRSGNFSAAMKDLSGKHPVSALMEICN 

KRRWQPPEFIiLVHDSGPDHRKHFLFRVLINGSAYQPNCMFFLNR 
Y 


5783 


1693 


698 


DSGLR VAFTMEG I SNFKTPSKLSK KKkS VLCSTPTIN I PAS P FM ~ 
2KLGFGTG VNVYiiMKRS PRGLSHS PWAVKKINP ICNDH YRS VYQ 
<RLMDEAKILKSLHHPNIVGYRAFTEANDGSLCI»AMEYGGEKSL 
TOLIEE/PI*SQ/PKILFQQP/LILKVAI^MARGLKYLHQEKKL 








LHGD I KSSWWI KGDFET I KI CDVGVSLPLDENMTVTDPEACYI ~ 
GTEPWKPKEAVEENGVITDKADIFAFGLTLWEMMTLSIPHINLS 
NDDDDEDKTFDES DFDDEAY YAALGTRPP INMEELDES YQKVI E 
LPS VCTNEDP KDR PS AAH1 VEALETDV 


5784 


2669 


1388 


PRVRPRVRTDHNY YI SRI YG PS DSASRDLWVNI DQM E KD KVK I H " 
G I LSNTHRQAARVNLSFDF P F YGH FLREI TVATGGFI YTGEWH 
RMLTATQYIAPIWANFDPSVSRNSTVRYFDNGTALVVQWDHVHL 
QDNYNLGS FTFQATLLMDGRI I FGYKE I PVLVTQISSTNHPVKV 
GL S DAF WVHRI QQI PNVRRRT I YE YHR VELQiMS K I TNI S AVEM 
TP L PTCLQ FNRCG PCVS S QI GFNCS WC S KLQRCS SGFDRHRQDW 
VDSGCFBESKEKMCENTEPVET\ FLEP PQP * 3RQPPSSGS* LPP 
E/DAVTSQFPTSLPTEDDTKIALHLKDNGASTDDSAAEKKGGTL 
HAGLI VGI L I LVL I VATAIL VTVYMYHH PTSAAS I FFI ERRPSR 
WPAMICFRRGS GH PB YZl P VPDVr* T? vcr* c» t xra r»r\ r* 


5785 


2669 


1388 


PRVRPRVRTDHNYYISR^YGPSDSASRDLWVNIDQMEKDKVKIH 
G I L SNTHRQAAR VNLS FD FP F YG H FLR E I T VATGG P I YTGEWH 
RMLTATQ Y I AP LMANFD P S VS RNS TVR Y FDNGTAL WQWDH VHIi 
QDNYNLGS FTFQATLLMDGR I 1 FGYKE I PVLVTQISSTNHPVKV 
GLSDAFVWHRIQOIPNVRRRTIYEYHRVELQMSKITNISAVEM 
TPLPTCLQFNRCGP CVS SQIGFNCSWCSKLQRCS SGFDRHRQDW 
v wovjurciiOMi 0\l r lK,CtVi I r*¥v a I \ i* L»h»P± , QP*ERQPPSSGS*'LPP 
E/DAVTSQFPTSLPTEDDTKIALHLKDNGASTDDSAAEKKGGTL 
HAGL I VGI LI LVLIVATAILVTVYMYHHPTS AASIFFI ERRPSR 
WPAMKFRRGSGHPAYAEVE PVGEKEG FIVSEQC 


"S787 


2532 


1674 


SYKLPAAERRASSCSQPPTPTRRRWPAPGRTSRGHRPQM*SGTP~~ 

APRPPARSTVSPASPLPKPRAGRCGSRPRSACSTFRPC*SLN*M 

S * H * KRNLSQRSS SMSRRPLSCARPHR* * RQGLTVAARL PT*WAK 

SPPLACSFCQAAQKSQSLSSGRSTR*PERMSFRP\SPPGNPAIP 

SLAPSSRP/PKGRPQCTWIPSRWPASPTAPPTTT*APTSSPGST 

GRSMMTCPTRWTATPWSARASSRPRNWPTP * V7RPSGRLSTV* RA 

TGGSTATAPPKRFPRNWNPMMAB 




2 


1460 


MAS AASVTSLADEVNCP \ ICQGTLKEAGSLSNCG/HKNFCRACL 
i \« ji>.e,±r \Wi'D\jjEBSP\TCP\I»CKEPFRP\GSFRPNWQLANV 
VENIERLQLVSTLGLGBEDVCQEHGEKIYFFCEDDEMQLCWCR 
EAGEHATHTMRFLEDAAVAPYREQIHKCLKCLIKEREEIQEIQS 
RENKRMQVLLTQVSTKRQQVISEFAHLRKFLBEQQSILLAQLES 

qdgdilrqrde fdllvagei crfsali eeleeknerparblltd 

IRSTLIRCETRKCRKPVAVSPEIiGQRIRDFPQQALPLQREMKMF 

leklcfeldyepahisldpqtshpklllsedkqraqfsykwqns 

PDNPQRFDRATCVLAHTG1TGGRHTV7WSIDLAHGGSCTVGVVS 

edvqrkgelrlrpeegvwavrlawgfvsalgsfp\trltlkeqp 

RQVRVSLDYE VGWVTFTNAVTREP I YTFTAS FTRKVt PFFGLWG 
RGSSFSLSS 


5788 


2 


6860 



ehsvsgrssaygdataeghpagpgsvssstgaistWghqegdg 

SEGEGEGETEGDVHTSNRLHMVRLMIiLERLLQTLPQLRNVGGVR 
AI P YMQVI LMLTTDLDGEDEKDKGALDNLLSQL I AELGMDKKDV 
S KKNERS ALNE VHL VVMRLLS VFMS RTKS GS KS S I CES SSL I S S 
ATAAALLSSGAVDYCLHVLKSLLEYWKSQQNDEEPVATSQLLKP 
HTTSS P PDMSPFFLRQ YVKGHAAD VFEAYTQLLTEM VLRLP YQ I 
KKITDTNS R I P P P VFDHS WF YFLSE YLM IQQTP F VRRQVRKLLL 
F I CGS KE KYRQLRDLHTLDS \ H VRG I KKLLEEQG I FLRAS WTA 
S PQSALQ YDTL I S LMEHL KACAE I AAQRT INWQKFC I KDDS VLY 
FLLQVS FLVDEGVS PVLLQLLS CALCGS KVLRALAASSGSSSAS 
SSPAPVAASSGQATTQSKSSTKKSKKEEKEKEKDGETSGSQEDQ 
LCTALVNQLNKFADKETLIQFLRCFLLESNSSSVRWQAHCLTLH 
IYRNSSKSQQELLLDLMWS3WPELPAYGRKAAQFVDLLGYFSLK 
rPQTEKKLKEYSQKAVElLRTQNHILTNHPNSNIYNTLSGLVEF 
DG YYLESDPCLVCNNPEVPFCY I KLSS I KVDTRYTTTQQWKLI 
3SHTISKVTVKIGDLKRTKMVRTINLYYNNRTVQAIVELKNKPA 
^WHKAKKVQLTPGQTEVKIDLPLPIVASNLMIEFADFYENYQAS 
rETLQCPRCSASVPANPGVCGNCGENVYQCHKCRSINYDEKDPF 








lcnacgfckyarfdfmlyakpccavdpieneedrkicaVSnIntLi 

LDKADR VYHQLMGH R PQLENLLC KVNE AAPE KPQDDSGTAGG I S 
STSASVNRYILQLAQEYCGDCKNSFDELSKriQKVFASRKELLE 
YDLQQREAAT ICS SRTS VQ PT FTASQ YRALS VLGCGHTS S TKC YG 
CASAVTEHCITLLRALATNPALRHILVSQGLIRELFDYNLRRGA 
AAMREE VRQLMCLLTRDNPEATQQMNDL I IGKVSTALKGHWANP 
DLASSLQYEMLLLTDS ISKEDS CWELRLRCALSLFLMAVNI KTP 
WVENI TIiM CLRI LQKL I KP PAPTS KKNKD VP VEALTT VKP Y CN 
EIHAQAQLWLKRDPKASYDAWKKCLPIRGIDGtfGKAPSKSELRH 
LYLTEKYVWRWKQFLSRRGKRTSPLDLKLGHNNWLRQVJbFTPAT 
Q AARQAACT I VEALAT I PSRKQQVLDLLTSYLDELSIAGECAAE 
YLALYQKLITSAHWKVYLAARGVLPYVGNLITKEIARLLALEEA 
TLSTDLQQGYALKSLTGLLSS F VEVES I KRHFKSRLVGTVLNGY 
LCLRKLWQRTKt>IDETQDMLLEMIjEDMTTGTESETKAFf4AVCI 
ETAKRYNLDDYRTP VFI FERLCS IIYPEBNEVTEFFVTLEKDPQ 
QEDFLQGRMPGNPYSSNEPGIGPLMRDIKNKICQDCDIiVAIiLED 
D S G ME LL VNNK I 1 SL D LPVAEVYKKVWCTTNEGE PMR I V YRMRG 
LLGDATE E F I E S LDS TTDEEEDE BE V YKMAG VMAQCGGLE CMLN 
RLAGIRDFKQGRHLLTVLLKLFSYCVKVKVNRQQLVKLEMNTLN 
VMLGTLNLALVAEQES KDSGGAAVAEQVLS 1 MEI\ ICAEPNVEP 
LSEDKGNLLLTGDKDQLVMLLDQI NSTFVRSNPS VLCGLbRI I P 
YLS FGEVEKMQI LVERFKP YCNFDKYDEDHSGDDKVFL\ D CFCK 
IAAG I K\NNSNGHQL\ KDL \ I LQKG I TQNALD\ YMKKH I P /SAA 
RIWDADI\WKSFCLRPALPFILRLLRGLAIQHPGTQVLIGTDSI 
PNLHKLEQVS \SDEG IGTLA\ENL\ LESLREHPDVNKKIDA\AR 
RETRAEKKRMAMAMRQKALGTLG \MTTNEKGQWD/TRTALLEA 
DWEELI EEP\GLTCX:iCREG YKFQPTKVLGI YTFTKRWLGGVW 
ENKPRETSRATSTVSHFNIVHYDC\HLA\AVSLARGREEWESAA 
LQNANTKCNGLLPVWGPHVPESAFATCLARHNTYLQECTGQREP 
TYQLNIHDIKLLFLRFAMEQSFSADTGGGGRBSNIHLI PYI IHT 
GLYVLNTTRATSREEKNLQGFLEQP KEKWVESAFEVDGPYYFTV 
LAI»H I LP P EQWRATRVE I LRRLLVTS QARAVAPGGATRLTD KAV 
KD YS AYRS SLLFWAL VDL I YNMFKKVPTS NTBGGWS CS LAE Y I R 

HNDMPIYEAADKALKTFQEEFMPVETFSEFLDVAGIiLSEITDPE 
SFLKDLLNSVP 


5789 


1 


2467 


LPLHAVE ktgr pg q pal km pgklrs dag LESDTAMKKGET lrkq 
TEEKEKKEKPKSDKTEEIAEEEETVFPKAKQVKKKAEPSEVDMN 
SPKSKKAKK\KEEPSQNDISPKTKSLRKKKEPIEKKWSSKTKK 
VTKNEEPSEEEIDAPKPKKMKKEKEMNGETREKSPKLKNGFPHP 
EPDCNPSEAASEESNSEIEQEIPVEQKEG\AFSNFPISEETIKL 
L KGRG VTFIjF P I QAKTFHHVYSGKDL I AQARTGTGKTFS FAI PL 
I EKLHG\ ELQDRKRGRAPQVLVLAPTRELANQVS KDFSDITKKL 
SVACFYGGTPYGGQFERMRNGIDILVGTPGRIKDHIQNGKLDLT 

KTiNHWIiDEVDOMT.r>Mf2Pfl nrtVITPTr CTr&V7VnQQn»nnmr t t-«<-i 

' UJ1 '" v w ""'jvwyi ijjiyriijtx'/tijy v EtCiJLLio VAX ivivUo CiDNPQJ. JjLFS 
ATCPHWVFNVAKKYMKSTYEQVDLIG KKTQ KTAI T VEHLAI KCH 
WTQRAAVIGDVIRVYSGHQGRTIIFCETKKEAQELSQNSAIKQD 
AQSLHGDI PQKQRBI TLKGFRNGS FGVLVATNVAARGLDIPEVD 
LVIQSSPPKDVESYIHRSGRTGRAGRTGVCICFYQHKEEYQLVQ 
VEQKAGI KFKRIGVPSATEI 1KASSKDAI RLLDSVPPTAISHFK 
QS AEKL I EE KGAVEALAAALAHI SGATS VDQRSL INSNVG FVTM 
ILQCSIEMPNISYAWKELKEQLGEEIDSKVKGMVFLKGKLGVCF 
DVPTASVTEIQEKWHDSRRWQLSVATEQPELEGPREGYGGFRGQ 

REGSRGFRGQRDGNRRFRGQREGSRGPRGQRSGGGNKSNRSQNK 
GQKRS FS KAFGQ 


5790 


3786 


1585 


ARRQRDPLQALRRRNQE LKQQVDSLLSB SQLKEALE PNKRQH I Y" " 
QRCIQLKQAIDENKNALQKLSKADESAPVAWYNQRKEEEHTLLD 
KLTQQLQGLAVTISRENITEVGAPTEEEEESESEDSEDSGGEEE 
DAEEEEEE KEENESHKWSTGEE Y I AVGDFTAQQVGDLTFKKGE I 
LLVI EKKPDGWW1AKDAKGNEGLVPRTYLEPYSEEEEGQES SEE 
SSEEDVEAVDETADGAEVK\QRTDPHWSAVQKAISEAGIFCLVN 
HVS FCYL I VLMRKRMETVEDTNGS ETGFRAWNVQSRGR I FLVS K 








PVTiQQINTVDVLTTMGAIPAGFRPSTLSQLLfcEGNQFRANYFLQH 
PELMPSQLAKRDIjMWDATEGTIRSRPSRrSLILTLWSCKMlPLP 
GMS IQVLSRHVRLCLFDGNKVLSNIHTVRATWQPKKPKTWTFS P 
QVTRIL P CLLDGDCFI RSNS AS PDLG I L FELGI S YI RNSTGERG 
ELS CG W V FL KLFDASG VP I PAKT YELFLNGGTP YE KG I E VDP S I 
SR RAHGS VF YQIMTMRRQPQL LVKLRS LNRRSRNVtjS LLPETIi I 
GNMCS IHIiL I FYRQ I LGD VLL KDRMSLQS TDLI SH P MLATFPML 
IiEQPDVMDAIiRSSWAGQES\TLKRSEKR\PK3FLKVPRFtiLVYH 
XGCVLPLL/HTPTRIiPPFRWAEEETETARWKVITDFLKQNQENQ 
GALQALLSPDGVHEPFDLSEQTYDFLGEMRKNAV 


5791 


3 


1634 


LRVAEFAGTSR/IGAGLIQPLHRAPARDHGLLRGGAAPALSVSH H 

GN/GKQL/AMSSQGSDDEQIKRENIRSLTMSGHVGFESLPDQLV 
I NRSIOOGFCFNXrjPVrtPTrSTi^ifC'rr TriTT TrKi»PN.Tc«i2'r-»vT3i<?>i^TTT?j-»^ I 
i * •nw^v^vta viiiAuv.vujiiuiuAoiijiUibfWlNlJfilJiESSHFCP 1 

NVKLKAQTYELQESNVQLKLTIVNTVGFGDQINKEESYQPIVDY 

IDAQFEAYLQEELKIKRSLFTYHDSRIHVCLYFISPTGHSUCTL 

DLLTMKNLDSKVYIIPVIAKADTVSKTELQKFKIKLMSEIiVSNG 

VQ I YQFPTDDDTI AKVNAAMNGQItP FAVVGSMDE VKVGNKMVKA 

RQYPWGWQVE>IENHCDFVKLREMbICrNMEDLREQTHTRIIYEL 

YRRCKLEEMGFrDVGPENKPVSVQETYEAKRHEFHGBRQRKEEE 

mkqmfvqrvke:<eailkeaerelqakfehi.krlhqeermkleek 

RRLLEEEIIAFSKKKATSEIFHSQSFZATGSNLRKDKDRKNSQF 

fvkqkvpehrrsssqanfikkklevcfdfavicfitsifgeqpq 
llifmekyfqvqgqyisqse 


5792 


2263 


653 


1 AAAAPSPAWWCGVFVVYWHTCWVMYGIVYTRFCSGDASCIQPY H 

UU^PKI^L\RHSFTTTRSHLGAENNIDLVLNVEDFDVESKFER 
TVNVSVPKKTRNNGTIiYAVTP'T uunrm DuunnvAimr mm 

YMVPKPEE I NLLTGB S DTQQI EADKKPTSALDE P VSHWRPRLAlj 
NVMADNFVF DGS S IjPAD VHR YKKM I QLGKT VHYL P IL F I DQI»SN 
RVKDLMVINRSTTEL PLT VS YDKVS LGRURFW I HMQDAVYS LQQ 
FGFSEKDADEVKGIFVDTNIjYFIALTFFVAAFHLLFDFLAFKND 
ISFWKKKKSMIGMSTKAVLWRCFSTWIFLFLLDEQTSLLVLVP 
AGVGAAI ELWKVKKALKMTI FWRGLMPEFQFGTYSESERKTEE Y 
DTQAMKYLSYLLYPLCVGGAVYSLLNIKYKSWYSWLINSFVNGV 
YAFGFLFMLPQLFVNYKLKSVAHLPWKAFTYKAFNTFIDDVFAF 
I ITMP TSHRLACFRDD WFIiVYL YQRWL YP VDKRR VNEFGE S YE 
EKATRAPHTD 


5793 


2263 


653 


aaaapspawwcgvfwyvvhtcwmygivytrpcsgdasciqpyH 

LARRPKLQL\RHSFTTTRSHLGAENNIDLVLNVEDFDVESKFER 
TVNVS VP KKTRNNGTtiYAYI FLHHAGVLPWHnfJTfrtl/UT ucot tt 

YMVPKPEEINLLTGESDTQQIEADKKPTSALDEPVSHWRPRLAL 

NVMADNFVFDGSSLPADVHRYMKMIQLGKTVHYLPILFIDQLSN 

RVKDLMVINRSTTEGPI.TVS YDKVS LGRLR F W I HMQDAVYSI*Q Q 

FGFSEKDADEVKGI FVDTNLYFLALTFFVAAFHLLFDFLAFKND 

ISFWKKKKSMIGMSTKAVLWRCFSTWIFLFLLDEQTSLLVLVP 

AGVGAAI ELWKVXKAIiKMTIFWRGLMPEFQFGTYSESERKTEEY 

DTQAMKYLS YLI» YPIiCVGGA VYSLLNI KYKS W YS WL INSFVNG V I 

YAFGFLFMLPQLFVNYKLKSVAHLPWKAFTYKAFNTFIDDVFAF 

IITMPTSHRLACFRDDWFLVYLYQRWLYPVDKRRVNEFGESYE 
EKATRAPHTD 


5794 


1 


5016 



MGPRLSWbbijLPAALIiLHEEHSRAAAKGGCAGSGCGKCDCHGSr] 

KGQKGERGLPGLOGVIGFPGMQGPEGPQGPPGQKGDTGEPGIiPG 

TKGTRGPPGASGYPGNPGLPGIPGQDGPPGPPGIPGCNGTKGER 

GPLGPPGLPGFAGNPGPPGLPGMKGDPGEI LGHVPGMLLKGERG 

FPGIPGTPGPPGLPGLQ3PVGPPGFTGPPGPPGPPGPPGEKGQM 

3LSFQGPKGDKGDQGVSGPPGVPGQAQVQEKGDFATKGEKGQKG 

BPGFQGMPGVGEKGEPGKPGPRGKPGKDGDKGEKGSPGFPGEPG 

STPGLIGRQGPXqgEKGEAGPPGPPGIVIGTGPIjGEKGERGYPGT 

PGPRGEPGPKGFPGLPGQPGPPGLPVPGQAGAPGFPGERGEKGD 

*GFPGTSLPGPSGRDGLPGPPGSPGPPGQPGYTNGIVECQPGPP 

3DQGPPGIPGQPGFIGEIGEKGQKGESCLICDXDGYRGPPGPQG I 

a PGEIGFPGQPGAKGDRGLPGRDGVAGVPGPQGTPGLIGQPGAK | 


5795 






GEPGEFYFDLRLKGDKGDPGFPGQPGMPGRAGSPGRDGHPGLPG 

PKGSPGSVGLKGBRGPPGGVGFPGSRGDTGPPGPPGYGPAGPIG 

DKGQAGFPGGPGSPGLPGPKGEPGKIVPLPGPPGAEGLPGSPGF 

PGPQGDRGFPGTPGR\PGL\PGBKGAVG\QPGXGFPGPPGPKGV 

DGIiPGDMGPPGTPGRPGFNGLPGNPGVQGQKGEPGVGLPGI»KGIi 

PGLPGI PGTPGEKGSIGVPGVPGEHGAIGPPGLQGIRGEPGP PG 

IiPGSVGSPGVPGIGPPGARGPPGGQGPPGLSGPPGIKGEKGFPG 

FPGLDMPGPKGDKGAQGLPGITGQSGLPGLPGQQGAPGIPGFPG 

SKGEMGVMGTPGQPGSPGPWGAPGLPGEKGD\HGFPGSSGPRGD 

P3LKGDKGDVGLPGKPGSMDKVYMGSMKGQKGDQGEKGQIGPIG 

EKGSRGDPGTPGVPGKDGQAGQPGQPGPKGDPGISGTPGAPGLP 

GPKGSVGGMGLPGTPGEKGVPGIPGPQGSPGLPGDKGAKGEKGQ 

AGPPGIGIPGLRGEKGDQGIAGFPGSPGEKGEKGSIGIPGMPGS 

PGLKGSPGSVGYPGSPGLPGEKGDKGLPGLDGIPGVKGEAGLPG 

TPGPTGPAGQKGEPGSDGIPGSAGEKGEPGLPGRGFPGFPGAKG 

DKGS KGEVGFPG1AGSPGI PGSKGEQGFMGPPGPQGQPGLPGSP 

onrticu*' ^«^^«-rW«UFt»LiPGJjPGPMG?PGLiPGIDGVKGDKGNP 

GWPGAPGVPGPKGDPGPQGMPGIGGSPG ITGS KGDMGPPGVPGF 

QGPKGLPGIiQGIKGDQGDQGVPGAKGIiPGPPGPPGPyDIIKGEP 

GLPGPEGPPGLKGLQGLPGPKGQQGVTGLVGIPGPPGIPGFDGA 

PGQKGEMGPAGPTGPRGFPGPPGPDGLPGSMGPPGTPSVDHGFIi 

VTRHSQTIDDPQCPSGTKIIiYHGYSLLYVQGNERAHGQDLGTAG 

SCLRKPSTMPTJTilPOXITIvrKnr^'kTt?!^ pnxmt/n «mt i-imi-iTiT^i . « » 

ov.unAc ^ *.i cuc\-n inn yi-ric AiKKDiSYWlrSTPEPMPMSMAP 
ITGENIRPFISRCAVCBAPAMVMAVHSQTIQIPPCPSGWSSLWr 
G YS FVMHTS AGAEGSGQAIiAS PG S CLE E FRS AP FIE CHGRGTCN 

YYANAYSFWLATIERSEMFKKPrPSTLKAGELRTHVSRCQVCMR 
RT 




1192 


61 


STRSPTVEYXSAHPHILFMIiLKGYEAPQIALRCGIMLRECIRHE 
PLAKIILFSNQFRDFFKYVELSTFDIASDAFATFKDLLTRHKVL 
V7^FI:EQNYDTIFEDYEKLI^S2NYVTKRQSLKLLGELILDRHN 
FAI MT K Y I S KPENLKLMMNLLRDKS PNI QFEAFHVFKVFVAS PH 
*^ A w^ ■*■ iiJiiiutyrAJjAfit Jjior QKERTDDEQFADEKNYLIKQI 
RDLKKTAP+RALRDSKR 


5796 


2 


1078 


GRVGWELWCMYISPPKDWWDAGDPSLPIRTPAMIGCSFWNRKF 
FGEIGIiIiDPGMDVYGGENIEI,GIKVWLCGGSMEVLPCSRVAHIE 
RKKKPYNSNIGFYTKRNALRVAEVWMDDYKSHVYIAWNLPLEKP 
G I D IGDVS ERRALRKSLKCKNFQ W YLDHVYPEMRR YNNTVAYGE 
LRNNKAKD VCLDQG PLENHTA IL YPCHG WG PQLAR YTKEG FLHL 
GALGTTTIjLPDTRCLVDNSKSRLPOr .T.nmwre qt vitd u wprn 
NGAIMNKGTGRCLEVENRGLAC IDLILRSCTGQRWTI KN3 1 K*R 

EGAGALEPGPQDMAAPPNlWTSCPGGETARGRQVTiDGPPRASPG 
QHRDPG 


5797 


2 


891 


PR VRQKT LVD VTIiENSN 1 KDO I RNLOOTYKA<>MnKT pvktvdot t? — 

VAQVENQLLKMKVESSQEANAEVMREMTKKLYSQYEEKLQEEQR 

KHSAEKEALLEETNSFLKAIEEANKKMQAAEISLEEICDQRIGEL 

DRLXERMEKERHQLQLQLLEHETEMSGELTDSDKERYQQLEEAS 

AS LRERIRK LNDMVHCQQ KKVKQMVEE I E S LKKKLQQKQLL I LQ 

LLEK I S FLEG ENNBLQSRliDYLTETQAECTE VE TRE IG VG CDLLP 

SQTGRTREIVMPSRNYTPYTRVLELTMKKTIjT 


5798 
5799 


644 


115 


K1LGSRWKSMSNQEKQPYYEEQARLSKIHLEKYPNYKYKPRPKR"" 
TC I VDGKKLR I G3 YKQLMRSRRQEMRQFFTVG QQPQI P I TTGTG 
WYPGAITMATTTPSPQMTSDCSSTSASPEPSLPVIQSTYGMKT 
DGGSLAGNEMINGEDEMEMYDDYEDDPKSDYSSEMEAPEAVSAN 


1 


2679 


1435 



LLSTYIKFINIjFPETKATIQGVLRAGSQIjRNADVEIjQQRAVEYL 
TLSSVASTDVTiATVLEEMPPFPERESS ILAKLKRKKGPGAGSAL 
DDGRRDP SSND INGGME PTPSTVS TPS PS ADLLGL RAAP P PAAP 
PASAGAGJJLLVDVFDGPAAQ PSLGPTPEEAFLSPGPEDIGPP I P 
SADBLLNKFVCKNNGVLFEKTQIiLQIGVKSEFRQNLGRMYLFYGN 
CTS VQFQNFS PTWHPGDLQTQLAVQTKR VAAQVDGGAQVQQVL 
IIECLRDFLTPPLLSVRFRYGGAPQALTLKLPVT1NKFFQPTEM 
^AQD F FQR WKQLS LPQQBAQK I FKANHPMDAE VTKAKHjGFGSA 








LLDNVDPNPENFVGAGIIQTKALQVGCLLRLBPNAQAQMYRLTL 
KI oivbt 1 VbKHLiCEJjIiAQQF 


5800 


2679 


1435 


IiLSTyiKFINLFPETKATIQGVLRAGSQLRNADVELQQRAVEYI, 
TLSSVASTDVIATVLESMPPFPERESSIIiAIGjKRKKGPGAGSAL 
DDGRRDPSSNDINGGMEPTPSTVSTPSPSADLLGLRAAPPPAAP 
PASAGAGNLLVDVFDGPAAQPS1*GPTPEEAFI,SPG PEDIGPP I P 
EADELLNKFVCKNNGVLFENQLLQIGVKSEFRQNLGRMYIiFYGN 
KTS VQFQNFfl PTWH PGDLQTQLAVQTKR VAAQVDGG AQVQQVL 
NIECLRDFLTPPLLSVRFRYGGAPQALTLKLPVTINKFPQPTEM 
AAQDFFQRWKQLSLPQQEAQKIFKANHPMDAEVTKAKLLGFGSA 
LLDNVDPNPENFVGAG I IQTKALQVGCLLRLEPNAQAQMYRIiTL 
RTS KEPVSRHLCELLAOQF 


5801 
5802 


3 


1413 


FPRLYHLI PDGEITS I KINRVDPSESLS IRLVGGSETPLVHI 1 1 
QHI YRDGVI ARDGRLLPGDI ILKVNGMD I SNVPHNYAVRLLRQP 
CQ VLWLTVMREQKFRS RNNG QAPDAYR PRDDS FHVI LNKS S PE E 
QLG I KLVR KVDEPG VF I FNVLDGG VAYRHGQL EENDRVLAINGH 
DLR YGS P ES AAHIi I Q AS ERRVH1*WSRQVRQRSPDI FQEAGWNS 
NGSWSPGPGERSNTPKPLHPTITCHEKWNIQKDPGESLGMTVA 
GGASHREWDLP I YVISVE PGG VI SRDGRI KTGD ILLN VDGVELT 
B VSRSEAVALLKRTSSS I VLKALEVKE YE PQEDCSS PAALDSNH 
NMAPPSDWS PSWVMWLELPRCLYNCKD I VLRRNTAGSIX3FCIVG 
GYEEYNGNKPFFIKSIVEGTPAYNDGRIRCGDrLIiAVNGRSTSG 
MIHACLARLLKELKGRI TLTI VSWPGTFL 




3 


290 


CF S L YQI MERI MDL PTL LRHAFREMPS VGGLF WMFR IRIX LCLM 
GAFFYLI S PliDFVPEALFGILGFLDDFFVIFLLLI Y I S IMYREV 
ITQRLTR 


5803 


2234 


1299 


EAQFGTTAE I YAYREEQDFGIB I VKVKAIGRQkFKVLEL&TQSD 
GIQQAKVQILPECVIjPSTMSAVQLESLNKCQIFPSKPVSREDQC 
SYKWWQKYQKRKFHCANLTSWPRWLYSLYDAETLMDRIKKQLRE 
KDENLKDDSLPSNPIDFSYRVAACLPIDDVLRIQIjLKIGSAIQR 
LRCE LDI MNKCTSIjCCKQCQETE I TTKNE IFSLS LCGPMAAYVN 
PHGYVHETIiTVYKACNLNLIGRPSTEHSWFPGYAWTVAQCKICA 
SHIGWKFTATKXDKSPQKFWGLTRSALLPT2PDTEDEISPDKVI 
LCI, • 


5804 




1707 


EME KQRQEEQ RKRT E E ERKRR I EQDMLEKRX I QRE LAKRAEQ I E " " 
D INNTGTES ASEEGDDSI»L ITWP VKS YKTSGKMKKNFE DLE KE 
REEKERIKYEEDKR I RYEEQRPSLKEAKCLSLVMDDEI ESEAKK 
ESLSPGKLKLTFEEIjERQRQENRKKQAEEEARKRI»EEEKRAFEE 
ARRQMVNEDE ENQDTAK I FKG YRPGKLKLS FE EMERQRREDE KR 
KAEEEARRRIEEEKKAFAEARRNMVVDDDSPEMYKTISQEFLTP 
GKLE INFEELLKQKMEEE KRRTEEERKHKLEMEKQEFEQLRQEM 
GEEEEENETFGIiSREYEELIKLKRSGSIOAXNLKSKFEKIGQLS 
BX\^xyiu^±isti^KAKRRAIDLEIKEREAENFllEEDDVDVRPARKS 
EAP FTHKVNMKAR FEQMAKAREEE EQRR I EEQ KLLRMQFEQRE I 
DAALQKKREEEEEEEGS I MNGSTAEDEEQTRSGAPWFKKPLKNT 
avvuafifVKi' i VitvTXstPKPEITWWFEGEILiQDGEDYQYIERGE 
TYCLYLPETFPEDGGEYMCKAVNNKGSAASTCILTIESKN 


5805 


3 


776 


YISDTLGQVYKSKIRWWIEENGGNGNISVDDliIALLDIiAEHASS" 
AFKES QQQS BDRE YE VKERLYPKS KRR YDT YN IAG YQGE I E VGL 
YTI QI LQLI P FFDNKNELSKRYMVNFVSGSSD IPGDPNNE YKLA 
LKNYI P YLTKLKFSLKKSFDFFDEYFVLLKPRNNI KQNEBAKTR 
RKVAG YFKKYVDI FCLLEESQNNTGLGSKFSB PI*Q VERCRRNLV 
ALKADKFSGLLEYDIKSQEDAISTMKCIVNEYTFLLK 


5806 


1257 


877 


AVFTFHNHGRTANLYSLHSWLGITTVFIjFACQRFLGFAVFJjLPW 
ASMWLRSLLKPIHVFFGAAILSLSIASVISGINEKLFFSLKNTT 
RPYHSLPSEAVFANSTGMLWAFGLLVLYILLASSWKRP 


5807 


2257 


1302 


RFS KKT FRR PMAVDIQP ACLGLYCG KTLLFKNGSTE I YG E CGVC 
PRGQRTNAQK^CQPCTESPELYDWLYIXSFMAMLPLVIiHWFFIEW 
YSGKKSSSALFQHITALFECSMAAI I TLLVSDPVGVIjYIRS CRV 
LMLSDWYTMLYNPSPDYVTTVHCTHEAVYPLYTIVFIYYAFCLV 








LMMLLR P LiJVKK IACGLGKSDR FKS I YAALYFFP I LT VLQAVGG 

GLLYYAFPYI ILVLSLVTLAVYMSASBI ENCYDLLVRKKRLI VL 

FSHWLLHAYGIISISRVDKLEQDLPLLALVPTPALFYLFTAKFT 
EPSRILSEGANGH 


5808 


2 


433 


SLPDSGVVEYDSNGGVADNHKDFGEbRYWECZiMNFSCNGKNGSS 
EGRITHGFQLKSAYEJSINLMPYTNYTFDFKGVIDYI FYSKTHMNV 

IiGVLGPLDPQWLVEUNITGCPHPHIPSDHFSLLTQLELHPPLLP 
LVNGVHLPNRR 


5809 


4*4 


2422 


ILVPGFQGILHPGVYCALQSQHQAQELVADIDECEVSGLCRKGG 
R CVNTHGS FE CYCMDG YLPRNGP E P FHPTTDATS CTE I DCGTPP 
EVPDGYI IGNYTSSLGSQVR YACREGFFSVPEDTVSS CTGLGTW 
ESPKLHCQEINCGNPPEMRHAILVGNHSSRLGGVARYVCQEGFE 
SPGGKITSVCTEKGTWRESTLTCTEILTKINDVSLFNDTCVRWQ 
INS RRINP K I S YV I S I KGQRLDPMES VREETVNLTTDS RTPE VC 
LALYPGTNYTVNISTAPPRRSMPAVIGFQTAEVDLLEDDGSFNI 
S I FNETCLKLNRRSRKVGSEHM YQFTVI^QRWYLANFS HATS FN 
FTTREQVPWCLDLYPTTDYTVNVTLLRSPKRHSVQITIATPPA 
VKQTISNISGFNETCliRWRSIKTADMEEMYLFHIWGQRWYQKEF 
AQEMTFNI S S S SRDPE VCLDLRPGTN YNVS LRALS S E L PWTSL 
TTQITEPPLPEVEFFTVHRGPLPRIxRLRKAKEKNGP I SS YQVLV 
LPLALQS T FS COSEGAS S FFSNAS DADG YVAAELLAKDVP DDAM 
EIPIGDRLYYGEYYNAPLKRGSDYCIILRITSEWNKVRRHSCAV 
WAQVKDS S LMLLQMAG VG LGS LA WI ILTFLS FS AV 


5810 


3 


1<541 


KVFGTHKDHE VSTLDTAI S AVKVQLAEFLENLQEKSLRI EAFVS 
B IESFFNTIEENCSKNEKRLEEQNEEMMKKVLAQYDEKAQSFEE 
VKKKK^FIiHEQMVHFLQSMDTAKDTLETIVREAEELDEAVFLT 
SFEEINERIiLSAMESTASLEKMPAAFSLFEHYDDSSARSDQMLK 
QVAVPQPPRLE PQE PNSATSTT IAVYWSMNKEDVI DS FQ VYCME 
E PQDDQEVNELVEEYRLTVKES YC I FEDLEPDRCYQVWVMAVNF 
TGCSLPSERAIFRTAPSTPVIRAEDCTVCWNTATIRWRPTTPEA 
TETYTLEYCRQHSPEG EG LRS FSG I KGLQLKVNLQPNDNYFFYV 
RAINAFGTSEQSEAALISTRGTRFLLLRETAHPALHISSSGTVI 
S FGERRRLTE I PS VLGEELP SCGQH YWETTVTDCPAYRLG ICSS 
SAVQAGALGQGETSWYMHCSEPQR YTFFYSG I VSDVHVTERPAR 
VGXLLD YNNQRLI FINAE SEQLLF 1 1 RHRFNEG VH PAFALEKPG 
KCTLHLGIBPPDSVRHX 


5811 


1918 


851 


AAAlJuOPIiPEDKWSAEKRRPLKSSLGYEITFSLLNPDPKSHDVY 
WD I EGAVRR YVQ P FLNALGAAGN FS VDSQ I L YYAMLGVNPRFDS 
AS S S YYLDMHSL PHV INP VESRLGS S AASL YP VLNFLL YVP ELA 
HS PL Y I QDKDGAP VATNAFHS PRWGG I M VYNVDS KT YNAS VLPV 
RVEVDMVRVMEVFLAQLRLLFGIAQPQLPPKCLLSGPTSEGLMT 
WELDRLLWARSVENIiATATTTLTSLAQLLGKISNIVIKDDVASE 
VYKAVAAVQKS AE ELASGHIiAS AF VAS QEAVTS S ELAFFD PSLL 
HLLYFPDDQKFAI YI PLFLPMAVPI LLSLVKIFLETRKSWRKPE 
KTD 


5812 


5204 


2744 



GGRQRCQRGRSCGAREBEVEPGTARPPPAASAMDASLEKIADPT 
IiAEMGKNLKEAVKMLEDSQRRTEEENGKKLISGDrPGPLQGSGQ 
DMVSILQLVQNLMHGDEDEEPQSPRIQNIGECGHMALLGHSLGA 
YISTLDKEKLRKLTTRILSDTrLWLCRIFRYENGCAYFHBEERE 
GLAKI CRLAIHS R YEDFWDGFNVL YNKKP VI YLS AAAR PGLGQ 
YLCNQLGLPFPCLCRVPCNTVFGSQHQMDVAFLEKLIKDDIERG 
RLPLLL VANAGTAAVGHTDKIGRLKELCEQYG I WLHVEGVNLAT 
LAI/5 YVSSSVIJUUUCCDSMTMTPGPW^ 

LTLVAGLTSNKPTDKLRALPLWLSLQYLGLDGFVERIKHACQLS 
QRLQESLKKVNYIKILVEDELSSPWVFRFFQELPGSDPVFKAV 
P VPNMTP SGVGRERHS CDALNRWLGBQLKQLVPASGLTVMDLEA 
EGTCLRFSPLMTAAVLGTRGEDVDQLVACIESKLPVLCCTLQLR 
SEFKQEVEATAGLLYVDDPNWSGIGWRYEHANDDKSSLKSYPQ 
3ENIHAGLLKKLNELESDLTFKIGPBYKSMKSCLYVGMASDNVH 
\AELVET I AATARE I EDNS RLLENMTE WRKGI QEAQVELQKAS 
3ERLLEEGVLRQI P WGS VLNWFSPVQALQKGRTFNLTAGSLES 








TEPIYVYKAQGAGVTLPPTPSGSRTKGRLPGQKPFKRSLRGSDA 
LSKTSS VSHI EDLEKVERIiSSGPEQITLEASSTEGHPGAPS PQH 
TDQTEAFQKGVPHPEDDHSQVEGPESLR 


5813 


2935 


^99 


HRDG VSGS LER PLTDRS RTGAFAQQRGKMATAGGGSGAD PGSRG 
LI.RLLSFCVLLAGLCRGNSVERKIYIPLNKTAPCVRLLNATKQI 
GCQSSISGDTGVIHVVEKEEDLQWVliTDGPNPPYMVLLESKHFT 
RDLMEKLKGRTS RI AGLAVS LTKPS PASGFS PS VQCPNDGFGVY 
SNSYGPEFAHCRBIQWNSLGNGLAYEDFSFPIFLLEDENETKVI 
KQC YQDHNLSQNG SAPT F PLCAMQ L FSHMAWLS FSTAT \ CMR RS 
SIQSTFSI NP K I VCDP LSD YNVWSMLKP I NTTGTLKPDDR WVA 
/iiKuuaKir ' wn v v\^vji^SAVASFVTQIAAAEALQKAPDVTTIi 
PRNVMFVPFQGETFDYIGSSRMVYDMEKGKFPVQLENVDSFVEIi 
GQVALRTSLELWMHTDPVSQKNESVRNQVEDLLATLEKSGAGVP 
AVILRRPNQSQPLPPSSLQRFLRARNISGWLADHSGAFHNKYY 
QS I YDTAENINVS YPE WLE PLKE / ET WN FG * QD TAKALADVATV 
LGRALYELAGGTNFSDTVQADPQTVTRLLYG\FLIKANNSWFQS 
I LQGRDLRS YLG * RGL FQH \ YIAV\ S S PTNT I YV/ VLQ YALANL 
TGTVVNLTREQCQDPSKVPSENKDLYEYSWVQGPLHSNETDRLP 
RCVRS TARLARALS PA PELS QWS STE YSTWTES RWKD I RARI FL 

IAS KELELI TLTVGFG ILI FSL I VTYCINAKADVLFIAPREPGA 
VSY 


5814 


8500 


432 



ALKCRPRRVLAILVGPVQPDRlVlAEEGAVAVCVRVRPI^SREESIi " 
GBTAQ V YWKTHNNVI YP VDGS KS FN FDRVLHGNETPKNVYE A\ I 
AAPIIDSAIQGYNGTIFAXYGQTXASGKTYTMMGSEDHLGVIPQ 
GQFHGHFSQKI * EVFLDREFLLRVS YMEI YNBTITDLLCGTQKM 
KPLIIREDVNRNVYVADLTEEWYTSEMALKWITKGEKSRHYGE 
TKMNQRSSRSHTIFRMILESREKGEPSNCEGSVKVSHLNLVDLA 
GSERAAQTGAAGVRLKEGCNINRSLFILGQVIKKLSDGQVGGFI 
NYRDSKLTR ILQNSLGGNPKTRI I CTI TP VSFDETLTALQFAST 
AKYM KNTP YVNE VS TDEALLKR YRKE I MD LKKQLEEVSLETRAQ 
AMEKDQLAQLLE E KDLLQKVQNEK I ENLTRML VTS S SLTLQQ3 L 
KAKRKRRVTWCLGKINKMKWSNYADQFN I PTNITTKTHKLS INL 
LREIDESVCSESDVFSNTLDTLSEIBWNPATKLIiNQENIESELN 
S LRADYDNLVLD YE QLRTE KEEMEL KLKEKNDLDE FEALER KTK 
KDQEMQL IHEISNLKNLVKHREVYNQDIiENELSS KVELLREKED 
QIKKLQEYIDSQKLENIKMDLSYSIiESIEDPKQMKQTLFDAETV 
AliDAKRESAFLRSENLEIiKEKMKELATTYKQMENDIQLYQSQLE 
AKKKMQVDLEKELQSAFl^ITICbTSLIDGKVPKDLLCNLELEGK 
I TDLQKELNKEVE ENEALRBE VI LLSELKS LPS EVERLRKE IQD 
KSEELHI ITSEKDKLFSEWHKESRVQGLLEEIGKTKDDLATTQ 
SNYKS TDQEFQNFKTLHMD FEQKYKMVLEENERMNQE I VNLS KE 
AQKFDSSLGALKTELSYKTQELQEKTREVQERLNEMEQLKEQLE 
NRDS PLQTVERE KTL I TEKLQQTLEE VKTLTQEKDDLKQLQES L 
QIERDQLKSDIHDTVNMNIDTQEQLRNALESLKQHQETriOTLKS 
KISEEVSRNl^EENTGETKDEFQQKMVGIDKKQDLEAKNTQTL 
TADVKDNEI IEQQRKI FSLIQEKNELQQMLES VIAEKEQLKTDI* 
KENIEMTIENQEELRLLGDELKKQQEIVAQEKNHAIKKEGELSR 
TCDRLAEVEEKLKEKS QQLQEKQQQLLNVQEEMSEMQKKINE IE 
NLKNELKNKELTLEHMETERLELAQKIiNENYEEVKS ITKERKVL 
KE LQKS FETERDHLRG Y IRE I EATGLQTKEELKI AH I HLKEHQE 

TIDELRRSVSEKTAQIINTQDLEKSHTKLQEEIPVLHEEQELLP 
NVKKVSETQETMNELELLTEQS TTKDSTTLAR I EMERLRLNEKF 

C?ESQEEIKSLTKERDNLKTIKEALEVKHDQLKEHIRETLAKIQE 
S QS KQEQS LNMKEKDN ETTKI VS EM EQFKPKDSALLR I E I EMLG 
LSKRLQESHDEMKSVAKEKDDLQPXQEVLQSESDQLKENIKEIV 
MCHLETEEELKVAHCCLKEQEETINELRVNLSEKETEISTIQKQ 
LEAINDKLQNKIQEIYEKEEQLNIKQISBVQEKVNELKQFKEFR 
KAKDSALQSIESKMLELTNRIjQESQEEIQIMIKEKEEMKRVQEA 
jQ I E RDQL KENTKE I VAKM KE S QE KE YQFLKMTAVNETQEKMC E 
t EHLKEQFE TQKLNLEN I ETEN I RLTQILHENLEEMRS VTKERD 
DIiRSVEETLKVERIXJLKENI^ETITRDLEKQEELKIVHPOTLKEH 








QETIDKLRGIVSBKTNEISNMQKDLBHSNDALKAQDLkf QEELR H 

I AHMHLKEQQET I DKL RGI VSE KTDKLSNMQ KDLENSNAKLQE K 

IQELKANEHQLITLKKDVNBTQKKVSEMEQLKKQIKDQSLTLSK 

LE I ENLNLAQ KLH ENL3EMKS VM KERDNLRRVEETLKLERDQLK 

ESLQETKARDIJSIQQELKTARMLSKEEKETVDKliREKISEKTIQ 

ISDIQKDLDKSKDELQKKIQELQKKELQLLRVKEDVNMSHKKIN 

EMEQIjKKQFE PNY LCKCEMDNFQLTKKLHESLEE IR I VAKERDE 

LRR I KES WCMERDQF I ATLREMIARDRQNHQVKPEKRLLSDGQQ 

HLMESLREKCSRIKELLKRYSEMDDHYECLNRLSLDLEKEIEFH 

RIMKKLKYVLSYVTKIKEEQHECINKFEMDFIDEVEKQKEIiLIK 

IQHLQQDCDVPSRELRDLKLNQNMDLHIEEILKDFSESEFPSIK 

TEFQQVLSNRKEMTQFLEEWLNTRFDIEKLKNGIQKENDRICQV 

NNFFNNR I IAI MNESTE PEERS ATI S KE WEQDLKSLKE KNEKLF 

KNYQTLKTSLASGAQVNPrTQDNKNPHVTSRATQLTTEKlRELE 

NSLHEAKESAMHKESKIIKMQKELEVTNDIIAKLC2AKVHESNKC 

IiEKTKETIQVLQDKVALGAKPYKEEIEDLKMKLGKIDLEKMKNA 

KEF EKE I SATKATVE YQKEVIRLLRENLRRSQQAQDTSVI SEHT 

D PQ PS NKP ItTCGGGSG I VQNTKALi I L KS EH I RLEKE I S KL KQQN 

EQL I KQKNELLSNNQHLSWE VKTWKERTLKREAHKQ VTCENS PK 

SPKVTGTASKKKOITPSQCKERNLQDPVPKESPKSCFFDSRSKS 

LPSPHP VRYFDNS SLGLCPEVQNAGAESVDS QP\GP WARLFQGK 

DVP\ECKTQ 


5815 


23 


1 


S EI>VM WTVQNRES I/3LLS F PVM I TM VCCAHS TNE PSNMS Y VKET 
VDRLLKGYD I RliRPD FGG P P VD VGMR I DVAS IDMVSEVNMDYTL 
TMYFQQSWKDKRIiSYSGI PLNLTIiDNRVADQLWVPDTYFLNDKK 
SFVHGVTVKNRMIRLHPDGTVLYGLRITTTAACMMDLRRYPLDE 
CNCTLEIESYGYTTDDIEFYWNGGEGAVTGVNKIELPQFSIVDY 1 
KMVSKKVEFTTGAYPRLSLSFRLKRNIGYFILQTYMPSTLITIL 
_ WVI>r Wil^DASAARVALGITTVLTMTTISTHLRETLPKIPYVK 1 
AID I YliMGClTVFVFLALLE YAFVNYI FFGKGPQKKGAS KQDQSA 
NEKNKLEMNKVQVDAHGNI LLSTLEI RNETSGSEVLTS VSDPKA 
TMYS YDSAS IQYRKPLSSRE \A*GRAPDRHG VPS KGR I RRRAS\ J 
QLKVK I PDLTDVNS I D KWSRMFFP I T FSLFNWYWL YY VH ( 


5816


861


1S1 


TSSRSRAAAQEGDAETPGS VERRGRRAGAEDGMSQAPGAQPS PP | 
_ vx«_KUKJj_iJjtAVHAIjNNVI*QQQ j 
NPHRSLL^TGITYDVNVIMAAI-K.GLG 

VUSLILNLPSPVSIiGLLSLPLRRRHLRWPC^Ij/VTVSY^NLDS 
K\LRAPEGPGGLRTE\ *GPFIiAAAIiAO^IjC_rVIiLVVTKEVEE 
SWLRTD 


5817 


851 


118 


RLFRGPGANRGRSCRGCSGGREPSGGA_,PKRHCPC*PPSPPAAD 
VMSNTTVPNAPQANSDSM VG Y VLGP FF1» I T1»VGVVVAWM YVQK 
KKRVDRLRHHLLPMYSYDPAEELHEAEQELI*SDMGDPKW\QAG 
RVATSTSGCHCWMSRRDLTPLPHPSEPGVLDCLGPCHLIiPLLS P 
GSPCWVIK3L«FSLHPPSAASASHA_,TITSLPPGLLPinA3VE_,TA 
HPQALMGRGF PS GMAAAGRH IjCFIj 


5818 


3 


3318 


3 


QAL R DKL W I FLVQS F YA VRHTES WKLMS TDDQQKI QAAAFDKGD I 
DRRLGKKP r FSS SQQRKQVSDSGDIKI KS WRGNNKKECWSYLST 
NKKM KS DGLG AS G HS S S TNRNS I NKTLKQDDVKEKDGTKI ASK I 
TKELKTGGKNVSGKPKTVTKSKTENGDKARI.ENMSPRQVVERSA 
TAAAAATGQKNLLNGKG VRNQEGQ I SGARP KVL TGNLNVQAKAK 
PLKKATGKDSPCLSIAGPSSRSTDSSMEFSISTECLDEPKENGS 
TEEEKPSGHKLS FCDS PGQMMKMS VDS VKNSTVAI KSRPVSR VT 
NGTSNKKSIHEQDTNVNNSVLKKVSGKGCSEPVPQAILKKRGTS 
NGCTAAQQRTKSTPSNLTKTQGSQGESPNSVKSSVSSRQSDENV 
AKLDHNTTTEKQAPKRKMVKQVHTALPKVNAKIVAMP1<NLNQSK 
KGE TLNNKDS KQ KM P PGQVI S KTQ PS SQR PLXHETS T VQKSMFH 
DVRDNNNKDS VSEQKPHKPLINLASEIS DAEAIjQSS CRP\DPQK 
PIiNEOEKEKIiALECQNlSKLDKSIiKHE_.ESKQICLDKSETKFPN 
HKE TDDCDAANI CCHS VGSDNVNS KF YS TTALKYMVSNPNENS I* 
VSNP VCDLDS TS AG Q I HLI S DRENTQVGR KDTNXQSS I KC VSDVS 1 
-CNPERTNG TLNSAQEDKKSKVPVEGLTI PS KX.SDBSAMDEDKH j








ATADSDVSSKCFSGQLSEKNSPKNMETSESPESHETPETPFVGH 
WNLSTGVLHQRBSPESDTGSATTS£3DDIKPRSEDYDAGGSQDDD 
GSNDRG I S KCGTMLCHDFLGRSS SDTSTPEELKI YDSNIjRI B VK 
M KXQS SND L FQ VNSTS DDE I PR KR P EI WS RS AI VHS REREN I PR 
GS VQFAQE IDQVSSS ADETEDERS EAENVAENFS I SNP APQQPQ 
GIINLAFEDATENECREFSANKKFKRSVLLSVDECEELGSDEGE 
VHTPFQASVDSFSPSDVFDGISHEHHGRTCYSRFSRESEDNILE 
CKQNKGNSVCKNESTVLDLSS I DS S RKNKQS VSATE KKNT I DVL 
SSRSRQLLREDKKVNNGSNVENDI QQRSKFlrDSDVKS QERPCHL 
DLHQRE PNS D I P KNS S T KS LD S FRS QVL PQEGP VKESHS TTTE K 
ANIALSAGDIDDCDTLAQTRMYDHRPSKTLSPIYEMDVIEAFEQ 
KVESETHVTDMDF+DDQHFAKQDWTLLKQLLSEQDSNLDVTNSV 
PEDLSLAQYLINQTLLIARDSSKPQGITHIDTLNRWSELTSPLD 
S S AS I TMAS FS S EDCS P QGE WT X LELETQH 


5819


1 


5557 



AAAGLUSAUHiiVMTLWAAARAEKEAFVQSESIIEVLRFDDGGL 

LQTETTLGLSSYQQKSISLYRGNCRPIRFEPPMLDFHEQPVGMP 

KME KVYLHNPSS E * T I TIiVS I FATTS HFHAS FFQNRKI L PGGNT 

S F D VS / VFLARWGNVENTLF INTSNHGVFTY \QVFGVGVPNP Y 

RLRPFIiGARVTVMSSFSPIINIHNPHSEPLQVVEMYSSGGDLHI* 

BLPTGQQGGTRKLWEIPPYETKGVMRASFSSREADNHTAFIRIK 

TNASDSTEFIILPVEVEVTTAPGIYSSTEMLDFGTLRTQDLPKV 

LN LHLLNSGT KDVP I TS VR PTPQ \ NDAI TVHFKP I TLKAS \ES K 

YTKVASISFDASKAKKPSQFSGKITVKAKEKSYSKLEIPYQAEV 

LDGYLGFDHAATLFHI RDS PADPVERP I YLTNTFSFAIIiIHDVL 

LPEEAKTMFKVHNFSKPVLILPNESGYIFTLLFMPSTSSMHIDN 

NILLITNASKPHLPVRVYTGFLDYFVLPPKIEERFIDFGVLSAT 

EASNILFAIINSNPIELAIKSWHIIGDG\LSIELVAVDRGNRTT 

IISSLPECEKSSSSDQSSVTLASGYF\AVFRVKI*TAKKJL\EGIH 

DGAIQITTDYEILTIPVK\AVIAVGSLTCSPKHWLPPSFPGKI 

VHQSLN I MN S F S QKVK I QQ I RSLS EDVR FYYKRLRGNKEDLE PG 

KKSKIANIYFDPGLQCGDHCYVGLPFLSKSEPKVQPGVAMQEDM 
WDADWDLHQSLFKGWTGI KENSGHRLSAIFEVNTDLQKNI I SKI 
TABLSWPSILSSPRHLKFPLTNTNCSS \EEEITLENP /SQDVPV 
YVQFI PIALYSNPS VFVDKLVSRFNLSKVAKIDLRTLEFQVFRN 
SAHPLQSSTGFMEG\LSPHLILNLIIiKPGEKKSVKVK\FTPVHN 
RTVSSLI I VRKNLTVMDAVMVQGQGTTENLRVAGKLPGPGSSLR 
FKITEALLKDCTDSLKLREPNFTLKRTFKVERTGQLQIHIETIE 
ISGYSCEGYGFKWNCQEFTLSANASRDIIILFTPDFTASRVIR 
ELKFITTSGSEFVFILNASLPYHMIATCAEALPRPNWELALYII 
I SG I MS AL FLLVTGTA\ YLEAQGI WE P\ FRRRLS \ FEASNPPFD 
VGRPFDLRRIVGISSEGNLNTLSCDPGHSRGFCGAGGSSSRPSA 
GSHKQ * GPSGHPHSSHSNRNSADVDDVRAYNSGRTSSMTSAQAA 
SSQPANKTRPLVLDSNTGAQGHSAGRKSKGAKQSQHGSQHHAHS 
PLEQH PQ P PLPP P VPQ PQEPQ P ERLS PAPIiAHPSHPERASSARH 
S SEDS D I TS LI EAMDXDFDHHD S PALE VFTEQPPS PLP KS KG KG 
KPLCRKVKP PKKQE EKEKKGKGKPQEDE LKDSLADDDS S STTTE 
TSNPDTEPLLKEDTEKQKGKQAMPEKKESEMSQVKQKSKKLLNI 
KKEIPTDVKPSSLELPYTPPLESKQRRNLPSKIPLPTAMTSGSK 
SRNAQKTKGTSKLVDNRPPAIiAKFLPKSQELGNTSSSEGEKDSP 
PPEWDS VPVHKPGSSTDSL YKLSIjQTLNAD IFLKQRQTS PTPAS 

PSPPAAPCPFVARGSYSSIVNSSSSSDPKIKQPNGSKHKLTKAA 
S Li PGKNGN PTFAAVTAG YDK 9 T>nriMr , Pa irv/c c\nr*vr> c*cr «-»-«■»■» 

HAPVDSDGSDSSGLWSPVSNPSSPDFTPLNSFSAFGNSFNLTGE 
VFSKLGLSRSCNQASQRSWNEFNSGPSYLWESPATDPSPSWPAS 
5GSPTHTATSVLGNTSGLWSTTPFSSSIWSSNLSSALPFTTPAN 
rLASIGLMGTENSPAPHAPSTSSPADDLGQTYNPWRIWSPTIGR 
fcSSDPWSNSHFPHEN 


5820 


310 


1270


* VS LSGP VS LG VLLCARSSTMGKR DNRVAYMNP I AMARS RGP I Q 
3SGPTIQ\ VI * IDQGLPGKK*KSN*KRKRK/DSKALAEFEEKMN 
=3WKKELEKHREKLLSGSESSSKKRQRKKKEKKKSW*\DSSSS\ 
JSSSDSSSS£3SDSEDED!OCQGKRRKKKKNRSHKSSESSMSETES








DS KDSLKKKKKS KDGTEKE KD I KGLS KKR KMYS EDKPLSS ESL S 
ESEYIEEVRAKKKKSSEEREKATEKTK2CKKKHKKHSKKKKKKAA 
SSSPDSP*H*EKSGFPYKESAMSEEISTVKTTTYLLKCMNFLVF 
GI I PGLFSSHSDATV 


5821


179 


915 


KWRNQ5 WRW PKP GTN WMI>S CS VCWRRVTWTG 5 VWMRKLG KHP QT 
PT/IKD CS I AATGKRPSARPPHQRRKKRREMDDGLAEGGPQRSN 
TYVI KL FDR S VDLAQ FS ENTPL YP I CRAWMRNS P S VRERE CS PS 
SPLPPLPEDEEG\SEVTNSKSR+CVQACPPTHTPGGQPKNACR\ 
SRI PS PLAALRMQGT P * RW S P FE PE P S PSTL I YRNMQRWKR I RQ 
RWKEASHRNQLR YSESMKI LREMYERQ 


5822 


464 


4379 


QTLKEMPIVMARDLEETASSSEDKEVI SQEDHPCIMWTGGCRRI " 
PVLVFHADA 1 LTKDNNI R V I G ER YHLS YK I VRTDS RLVRS I LTA 
HGFHEVHPSSTDYNLMWTGSHLKPFLLRTLSEAQKVNHFPRSYE 
LTRKDRLYKNI I RMQHTHG FKAFH I LPQTF LLPAE YAE F CNS YS 
KDRGPWIVKPVASSRGRG\VYLINNPNQISLEENILVSRYINNP 
LL IDDFKFDVRIjYVLVTS YDPLVI YLYEEGLARFATVRYDQGAK 
NI RNQFMHLTN YSVNKKSGDY VS CDDP3VEDYGNKWSMSAMLRY 
LKQEGRDTTALMAHVEDL 1 1 KTI ISAELAIATACKTFVPHRS S C 
FEL YG FDVL I DS TL KPW L L E VN L S PS LACDAP LDL KI KAS M I S D 
MFTWGFVCQDPAQRASTRPIYPTFESSRRNPFQKPQRCRPLSA 
SDABM KNLVGSAREKGPGKLGGSVLGLSMEEI KVLRRVKEENDR 
RGGFI RI FPTS ETWEI YGS YLEHKTSMNYMLATRLFQDRMTADG 
APELKI*SLNSKAKLHAALYERKliI,SLEVRKRRRRSSRI 1 RAMRP 
KYPVITQPAEMNVKTETESEEEEEVALDNEDEEQEASQEESAGF 
LRENQAKYTPSLTALVENTPKENSMKVREWNNKGGHCCKLETQE 
LEPKFNLMQILQDNGNLSKMQARIAFSAYLQHVQI\RLMKDSGG 
QTFS AS WAAKEDEQMEL WRFLKRASNNLQHSLRMVIj PSRRLAL 
LERTRILAHQLGDFI I V YNKETEQMAEKKSKKKVEEEEEDGVNM 
E^FQE FI RQASEAELEEVLTFYTQKNKSAS VFLGTHSKIS KNWN 
NYSDSGAKGDHPET I MEE VKI KPPKQQQTTE IHS D KLSRFTTS A 
E KEAKLVYSNS S S G PTATLQK I PNTHLS S VTTS DLS PGP CHHS S 
LSQIPSAIPSMPHQPTILLNTVSASASPCLHPGAQNIPSPTGLP 
RCRSGSHT IGP FS S FQSAAH I YS QKLSRPS S AKAGSC YLNKHHS 
G I AKTQKEGEDASLYSKRYNQSM VTAELQRLAE KQAARQ YS PSS 
HINLLTQQVnJLNLATGI INRSSASAP PTLRPI I S PSGPTWS TQ 
SDPQAPENHSSSPGSRSLQTGGFAWEGEVENNVYSQATGWPQH 
KYHPTAGSYQLQFALQQLEQQKLQSRQLLDQSRARHQAIFGSQT 
IiPNSNLWTMNNGAGCRISSATASGQKPTTLPQKWPPPSSCASL 
VPKPPPNHEQVLRRATSQKASKGSSAEGOLNGLQSSLNPAAFVP 
ITSSTDPAHTKIMNHKHTEKQPVHHSWVHD 


5823 
5824


42 


2293 


LLTALSMEGGGGRDEPSACRAGDVNMDDPKKEDIIiLLADEKFDF 
DLS LSS S SANEDDEVFFGPFGHKERCIAASLE LNN P VPEQP PL P 
TSES P F AWS PLAGE KF VEVYKEAHLLALH I ES SS RNQAAQAAKP 
EDPRSQGVERFIQESKF\KINLFEKEKEMKKSPTSLKRETYYLS 
DS PLJLG P P VGEPRLLAS S PALPS SGAQARLTRAPG P PHS AHALP 
RESCTAHAASQAATQRKPGTKLLLPRAAS VRGRGI PGAAEKPKK 
E I PAS PS R TKI PAE KES HRD VLP D KP APGA VNVPAAGS HLGQG K 
RAIPVP\NKLGLKKTLLKAPGSYSN\IiQRKSSSGA\VWSGASSA 
CTPQPVAKAKSSEFASI PAN*LPGLCPNI SKS \GRMGPAMLRPA 
L\ PAGPVG \AS SWQAKR VDVS ELAAEQLTAPP\SAS PTQPQTPE 
GGG\QWLNSSCAWSESSQLNKTRSIRRRDSCLNSKTKVMPTPTN 
QFKIPKFS IGDS\PDSSTPKLSRAQRPQSCTSVGRVTVHSTPVR 
RSSGPAPQSLLSAWRVSALPTPASRRCSGLPPMTPKTMPRAVGS 
PL\ CVPARRRSSEPRKNSAMRTE PTRESNRKTDSR\ LVDVS PDR 
GS PPSRVPQALNFS PEES DSTFS KSTATEVARE EAKPGGDAAPS 
E ALLVD I KLE PLAVTPDAASQ PL I D L PL I DFCDTPEAHVAVGSE 

SRPLIDLMTNTPDMNKNVAKPSPWGQLIDLSSPI.IQT.SPEADK 
ENVDSPLLKF 




42 


2293 



LLTALSMEGGGGRDBPSACRAGDVNMDDPKKEDILLLADEKFDF 
DLSLSSSSANEDDEVFFGPFGHKERCIAASLELNNPVPEQPPLP 
rS ES PFAWS PliAGEKFVEVYKEAHLLALH I ES SSRNQAAQAAKP 






WO 01/53312 



PCT/US00/34263 











EDP RSQG VERF IQESKF\KI NL FEKE KEM KKS PTS LKRET Y Y LS 
DS PLLGPPVGEPRLLASSPALPSSGAQARLTRAPGPPHSAHALP 
RESCTAHAASQAATQRKPGTKLLLPRAASVRGRGI PGAAEKPKK 
EIPASPSRTKIPAEKESHRDVLPDKPAPGAVNVPAAGSHLGQGK 

RArPVP\NlCt.GT»1<CVTTiT.TfZlDnQV<!Kr\ t rtDtrcc<cn>i V ,„,-,-, „ - 

itr v c \««vuv7Jjjvi^j.uur^fvjo * »N\LQKKSSSGA\VWSGASSA 
CTPQPVAKAKSSEFAS I PAN* LPGLCPNISKS\GRMGPAMLRPA 
L\ PAGP VG\ ASS WQAKRVDVS ELAAEQLTAP P \SAS PTQPQTPE 
GGG\QWLNSSCAWSESSQLNKTRSIRRRDSCLNSKTKVMPTPTN 
QFKIPKFSIGDS\PDSSTPKLSRAQRPQSCTSVGRVTVHSTPVR 
RSSGPAPQSLLSAWRVSALPTPASRRCSGLPPMTPKTMPRAVGS 

P1j\CVPARRRS seprknsamrte ptresnrktdsr\ lvdvs pdr 

GS PPSRVPQALNFS PEESDSTFSKSTATE VAREEAKPGGDAAPS 

BALLVDIKLEPLAVTPDAASQPLIDLPIiIDFCDTPEAHVAVGSE 

SRPLIDLMTNTPDMNKNVAKPSPWGQLIDLSSPLIQLSPEADK 
ENVDSPLLKF 


5825 


2 


4210


FLQIESASPAPFSSGFLAAHPHSPGGSLATKGRSRLSAPGMLHIi 
S AAP PAP P P E VTATARPCLCS VGRRGDGG KMAAAGAIjE R SFVEL 
SGAERERPRHFREFTVCS I GTANA VAQLA VfCYS E S AGG F Y YVE S G 
KLFSVTRNRFIHWKTSGDTLELMEES LD I NLLNNAI RL KFQN CS 
VLPGGVYVSETQNRVIILMLTNQTVHRLLLPHPSRMYRSELVVD 
SQMQSIFTDlGKVDFTDPCNYQIiIPAVPGISPNSTASTAWLSSD 
GEALFALPCASGGIFVLKLPPYDIPGMVSVVELKQSSVMQRLLT 
GWMPTAIRGDQSPSDRPLSLAVHCVEHDAFIFALCQDHKLRMWS 
YKEQMCLMVADMLEYVPVKKDLRLTAGTGHKLRLAYSPTMGLYL 
GIF\MHAPKRGQFCIFQLVSTBSNRYSLDHISSLFTSQETLIDF 
ALTSTDIWALWHDAENQTVVKYINFEHNVAGQWNPVFMQPLPEE 
EIVIRDDQDPREMYLQSLFTPGQFTNEALCKALQIFCRGTERNL 
DMV/SEtiKKEVTLAVENELQGSVTEYEFSQEEFRNLQQEFWCKF 
YACCLQYQEALSHPLALHLNPHTNMVCLLKKGYLSFLI PS s LVD 
HLYLLPYENLLTEDETTISDDVDIARDVICLIKCLRLIEESVTV 
DMS V I MEMS CYNLQS PEKAAEQ I LEDMI T I D VENVMED I CS KLQ 
EIRNPXHAIGLLIREMDYETEVEMEKGFNPAQPLNIRMNLTQLY 
GSNTAG YI VCRGVH KI ASTR FLI CRDLLI LQQLLMRLGDAV I WG 
TGQLFQAQQDLLHRTAPLLLSYYLIKWGSECLATDVPLDTLESN 
LQHLSVLELTDSGALMANRFVSSPQTIVELFFQEVARKHIISHIi 
FSQPKAPLSQTGLNWPEMITAITSYLLQLLWPSNPGCLFLECLM 
vyijyi^iiyjji^pwcaVNVGSCRFMljGRCYLVTGEGQKAL 
ECFCQAASEVGKEEFLDRLIRSEDGEIVSTPRLQYYDKVIiRLLD 
VIGLPEL VIQLATSAI TEASDDW \KS QATIi\ RTCIFKHHL\ DLG 
\HNSQAYGSL*PQIPDSSRQLDCLRQLVWLCERSQLQDLVEFS 
YVWLHNE WGI IESRARAVDLMTHNYYELLYAFHI YRHNYRKAG 
TVMFEYGMRMREVRTLRGLBKO^CYLAALNCLRLIRPEYAWI 

VQPVSGAVYDRPGASPKRNHDGECTAAPTNRQIEILELEDLEKE 
CSLARIRLTLAQHDPSAVAVAGSS9AFFMVTT T.vnan i?rvp»To 

LCQTFKLPLTP VFEGLA FKC I KLQ FGGEAAQAEAWAWLAANQLS 

S VI TTKES SATDEAWRLLS T YLERYKVQNNL YHHC VINKLLSHG 

VPLPNWL INS YKKVDAAELLRL YLNYDLLDLTP YQVIRI cgc 


5826 
5827


3 


871 


KbQLLRDHSAPPPKPCTSVGAMGC*PRQ/SPKEQQRQLKKQKNR 
AAAQRS RQKHTDKADALHQQHES LE KDNLALRKE I QS LQAELAW 

WSRTLHWERiCPMDCASCSAPGLLGCWDQAEGLLGPGPQGQHG " 
CREQLEL FQTPGS C Y PAQP LS PGPQPHDS PSLLQCPLPS LS LGP 
A WAE P P VQLS PS PLL FASHTGS S LQGSS SKLS ALQPS LTAQTA 

PPQPLBLEHPTRGKLGSSPDNPSSALGLARLQSREHKPALSAAT 
WQGLWD P S PHPLLAFP LLSSAQVHF 




194 


2287


^atavSALKSYTIiP^PPFTLPSGLAVYPAVLQDGKFASVFVYK 
^ ENEDKVNKAAKVP * * HLKTLRHPCLLRFLSCTVE ADG I HL VTE 
WQPLEVALETLSS AE VCAG I YD I LLAL I FLHDRGHLTHNN VCL 
3SVFVSBDGHWKLGGMETVCKVSQATPEFLRSIQSIRDPASIPP 
3EMSPEFTTLPECHGHARDAFSFGTLVESLLTILNEQVSADVLS 
5FQQTLHSTLLNPIPKWRPALCTLLSHDFFRNDFLEWNFLKSL 
?hKSEEE KTE FFKFLLDR VS CLS EEL IAS RLVPLLLNQLVFAE P



5828



VAVVKSFLPYLLGPKKDHAQGETPCLLSPALFQSRVIPVtiOLF 
EVHEEHVRMVLLSHIEAYVGAI^SLREQLKKVMLNPQVLLGNLR 
D \ TS DS I VAITLHSLAVLVS LLGPE WVGG ERTK I FKRTAP \ S F 
TK\NTDLSLEGDPFSQPIKFPINGLSDVKNTSEDSENFPSSSKK 
SEEWPDWSGPE\EPENQTVNI\QIWP\REP\CDDVKSQCTTLDV 
SES S WDDCE PS SLDTKVNPGGG I TATKP VTSGEQKP I PALLS LT 
BESMPWKS SLPQKI SLVQRGDDADQIE PPKVSSQERPLKVPS BL 
GLGEEFTIQVKKKPVKDPEMDWFADMIPEIKPSAAFLILPELRT 

EMVPKKDDVSPVMQFSSKFAAAEITEGEAEGWEEEGELNWEDNN 
W 



2 57 ~ | AR EGGSLGAVAACG ELS YSCD FC PARPHTS WLTRF VKM E FQAW 

MAVGGGSRMTDLTSS I P KPLL P VGNKF L I W YPLNLLER VGFEEV 
I WTTRD VQ KALCAE FKMKMK PD I VCI PDDADMGTADS LR Y I YP 
KLKTDVLVLSCDLITDVAIiHEWDLFRAYDASLAMLHRKGQDSI 
E P VPGQKG KKKAVEQRD F IG VDS TGKR LL FMANEADLDEE LVI K 
GS ILQKHPRIRFHTGLVDAHLYCLKKY I VDFLMENG\S ITS IRS 
EL \ I PYLV/ RGKQFSSASSQQGTRKEKEGGSKGKRGLKSFRISY 
S FY* KEAN YTGTGAP Y\D \ ACWI 



"260" 



1259 



4496 



3139 



PDGRLIVSCSEDKTIKIWDTTNKQCVNNFSDSVGFANFVDFNPS" 
GTCIASAGSDQTVKVWDVRVNICLLQHYQVHSGGVNCISFHPSGN 
YL I TASSDGTL K I LDLL KGRL I YTLQ GHTG P VFT VS FSKGGELF 
ASGGADTQVLLWRTNFDELHCKGLTKRNLKRIjHFDS PPHLLDI Y 
PRTPHPHEEKVETVEDFFLHLLRL IQSLR* S I CRSLLPLLWISF 
LLILPQQQKPWGLCQTRVKRPVDIS*TLP*CHQNVCQQPRKRK 
| CKr* VTSPVKVK/ VS IPLAVTDALEHIMEQLNVLTQTVS I LEQR 
LTLTEDKLKDCLENQQKLFSAVQQKS 



"583T 



71 



2897 



GGKMAAPEERDLTQEQTE KLLQ FQDLTG I ES MDQCRHTLEQHNW 
NIEAAVQDRLNEQEGVPSVFNPPPSRPLQVNTADHRIYSYWSR 
PQPRGLLGWGYYLIMLPFRFTYYTILDIFRFALRFIRPDPRSRV 
TDPVGD I VS FMHS FEEKYGRAHPVFYQGTYSQALNDAKRELRFL 
LVYLHGDDHQDS DEFCRNTLCAPE VI SLINTRMLFWACS TNKPE 
GYR VSQALRENT Y P FLAM I MLKDRRE * P V\ VGRLEGL I \QPDDL 
INQLTF I MDANQT YL VS E R LERE ERNQTQVLRQQQDEAYLAS LR 
ADQEKERKKREERERKRRKKEEVQQQKLAEERRRQNLQEEKERK 
LECLPPEPSPDDPESVKIIFKLPNDSRVERRFHFSQSLTVTHDF 
LFSLKESP\EKFQIEA\NFPRR\VLPCIPSEE\WPNPPTLQE\A 
GLSHTEVLFVQDLTDE 



"245T 



829 



FCSKD KCCL YL PDS INRS KS CTAKPG AHSQDRHAVMD^ B feQVKD 
TDD I E S P KRS I RDS GY I D CWDS ERSD S LSP PRHGRDDS FDS LDS 
FGSRSRQTPSPDWLRGSSDGRGSDSESDLPHRKLPDVKKDDMS 
ARRTSHGEPKSAVPFNQYLPNKSNQTAYVPAPLRKKKAEREEYR 
KSWSTATSPAGLGKKALQDYGPRT\PVS\DDAESTSMFDMRC5E 
EAAVQPHSRARQEQLQLINNQLREEDDKWQDDLARWKSRKRSVS 
QDLI KKEEERKKMEKLLAGEDGTS ERRKSI KTYRE I VQEKERRE 
RELHE AY KNARS QE EAEG I LQQY I ERFTI SEA VLE RLEMP K I LE 
RSHSTEPNLSSFLNDPNPMKYLRQQSLPPPKFTATVETTIARAS 
VLDTSMS AGSGS P S KTVTPKAVPMLTPK P YSQP KNS QDVL KTF K 
VDGKVSVNGETVHREEEKERECPTVAPAHSLTKSQMFBGVARVH 
GSPLELKQDNGS I E INI KKPNS VPQELAATTEKTEPNSQBDKND 
GGKSRKGNIELASS EPQHFTTTVTRCS PTVAFVE FPS SPQLKND 
VSEEKDQKKPENEMSGKVELVLSQKWKPKSPEPEATLTFPFLD 
KMPEANQLHLPNLNS Q VDS PS S E KS PVTTPFK F WAWDPEE ERRR 
QEKWQQEQERLLQERYQ\KEQDK\LKEE\WEKAQKEVEEEERRY 
YEEEP * 1 1 \ ED PWPFTVS SS SADQLS TSSSMTEGS GTMNKI DL 
GNCQDEKQDRRWKKSFQGDDSDLLLKTRESDRLEEKGSLTEGAL 
AHSGNPVSKGVHEDHQLDTEAGAPHCGTNPQLAQDPSQNQQTSN 
PTHSSEDVKPKTLPLDKSINHQIESPSERRKSISGKKLCSSCGL 
PLGKG AAM 1 1 ETLNL YFH I Q CFRCG \ I CKGQLGDAVSGTDVRI R 
NGLLNCNDCYMRSRSAGQPTTL 



PGRRFRHGSCAFQKQCIMLHI CQ YFLQGE CKFGTS C KRS UDFSN 
SBNLBKLEKLGMS S DL VS R LPT I YRNAHD I KNKSS APSRVPPLF 











RVHFHLPYRWQb'LDRGKWEDIiDNMBLIBEAYCNPKIERILCSES 
ASTPHSHCLNFNAMTYGATQARRLSTASSVTKPPHPILTTDWIW 
YWSDEFGSWQEYGRQGTVHPVTTVSSSDVEKAYLAY/WYTGV*R 
PGSHLEVPGRKAQLRVRFQSLRSEKPGLWHN* KGLPQTQI R \AP 
QDVTTMQTCNTKFPGPKSIPDYWDSSALPDPGFQKITLSSSSEE 
YQKVWNIiFNRTLPFYFVQKIERVQNIALWEVYQWQKGQMQKQNG 
GKAVDERQL FHGTSA I FVDAI CQQNFDWRVCGVHGTS YGKGS YF 
ARDAA YSHH YS KS DTQTHTMFLAR VL VGE F VRGNASFVRP P AKE 
GWSKAFYDSCVKSVSDPSI FVI FEKHQVYPEYVIQYTTSSKPSV 
TPSILLALGSLFSSRQ 


5833 


170 


3289 


SILCLLSPCWQFQKPWSILSSRSRHSPCTKKGWEGMRKHIjHT 
RQGHK* VHVE ISKALWVYRDDYF1 RHS IS VS AVI VRAWITHKYR 
GRDWNVKWEENIiLHAVAKNYTLLQTI PPFERPFKDHQVCLEWNM 
G Y I WNLRANR I PQC PI*END WALLGF P YAS SGENTG I VKK F PR F 
RNRELEATRRQRMDYPVFTVSLWLiYLIjHYCKANLCGILYFVDSN 
EM YGT P S VFLTEEG YLH I QMHLVKGEDIAVKTKF 1 1 PLKE W FRL 
DI S FNGGQ I WTT S IGQDLKS YHNQTI S FREDFHYNDTAGYFI I 
GGSR YVAG I EG FFG PI»K Y YRLR SLHPAQ I FNPLLEKQLAEQ I KD 
YYERCAEVQEIVSVYASAAKHGGERQEACHLHlSrSYIiDLQRRYGR 
PSMCRAFPWEKELKDKHPSLFQALLEMDLLTVPRNQNESVSEIG 
GKI FEKAVKRLSS IDGIjHQIS S I VPFLTDSSCCG YHKAS YYLAV 
FYETGLNVPRDQLQGMLYSIiVGGQGSERLSSMNIiGYKHYQGIDN 
YPLtDWEIiSYAYYSNTATKTPLDQHTIiQGDQAYVETIRT.KDDE 1 1* 
KVQTKEDGDVFMWLKHEATRGNAAAQQRLAQMLFWGQQGVAKNP 
EAAIEWYAKGALETEDPALIYDYAIVLFKGQGVKKNRRLALELM 
KKAAS KGLEQAVNGLG WY YHKFKKNYA\ KAAKYWLKA\ EE \ MGN 
PDASYNLGVJUHLDGIFPGVPGRNQTLAGEYFHKAAQGGHMEGTJb 
WCSLYYITGNLETFPRDPEKAVVWAKHVAEKNGYLGHVIRKGLiN 
AYLEG S WHEALL YYVIiAAETGIEVS QTNLAHI CBERPDIoARRYTi 
GVNCVWRYYNFSVFQIDAPSFAYIiBCMGDLYYYGHQNQSQDLEIjS 
VQMYAQAALDGDSQGFFNLAIiLIEEGTIIPHHILDFLEIDSTLH 
SNNIS I LQELYERCWSHSNEESFS PCSLAWLYLHLRLLWGAILH 
SAL I YFIiGTFLLS I LIAWTVQYFQS VSASDPPPRPSQASPDTAT 
STASPAVTPAADASDQDQPTVTNNPEPRG 


5834


17 


4020 



RFRRGGGRVFPGAFPASPSDSLGQGNSQGPPRTPKPPRT/QECG ~ 
SAAPGPI PGQS SS * VPIiRLEQIQ QKADCPLS LEIiALKPRMAAQ V 
TLB DALS NVDLLE E LPI» PDQQPC I E P PPSS LLYQPNFNTNFEDR 
NAFVTG I ARYI EQATVHSSMNEMLEEGQEYAVMLYTWRSCSRAI 
PQVKCNEQPNRVEI YEKTVEVLBPEVTKIiMNFMYFQRNAI ERFC 
GEVRRLCHAERRKDFVSEAYLITJ^GKFINMFAVLDELKNMKCSV 
KNDHSAYKRAAQFLRKMADPQS IQESQNLSMFLANHNKI TQSLQ 
QQLBVISGYEELLADI\^I > CVDYTENRMYLTFSEKHMLLKVMGF 
GLYLMDGSV5NIYKLDAKKR INIiSKIDKYFKQLQWPLFGDMQ I 
ELARYIKTSAHYEENKSRWTCTSSGSSPQYNICEQMIQIRBDHM 
RFISEIiARYSNSEVVTGSGRQEAQKTDAEYRKLFDLAtiQGLQLIf 
SQMSAHVMEVYSWKLVHPTDKYSNKDCPDSAEEYERATRYNYTS 
EEKFALVEVIAMI KGLQVLMGRMES VFNHAIRHTVYAALQDFSQ 
VTLME PLRQA I KKKKNV I QS VLQA I RKTVCDWETGHE PFNDPAli 
RGEKDPKSG*D3KVPRRAVGPSSTQLYI4VRTMLESLIADKSGSK 
KTJjRSSLEGPTI LD I EKFHRES FF YTHL INFSETLQQCCDLSQIj 
WFRE FFLE LTMGR R IQFP I EMS MPW I LTDH I LETKEAS MME YVL 
YSLDLYNDSAHYAIiTRFNKQFLYDEIEAEVNriCFDQFVYKIiADQ 
I FAY YKVMAGS LLLDKRLRSECKNQGATlHIiPPSNR YETLLKQR 
HVQLLGRS I DLNRL I TQRVSAAM YKSLELAIGRFES EDLTS I VE 
LDGLLEINRMTHKLIjSRYLTIJX5FDAMFREANHNVSAPYGRITIj 
HVFWELNYDFIiPNYCYNGSTNRFVRTVLPFSQEFQRDKQPNAQP 
QYLHGSKALNLAYSSIYGSYRNFVGPPHFQVICRLLGYQGIAVV 
MEBLLKWKSLLQGTI LQYVKTIiME VMPKI CRLPRHEYGSPGI L 
EFFHHQXjKDIVEYAELKTVCFQNJjREVGNAILFCLLIEQSLSLE 
£ VCDLLHAAPFQN I L PRVH VKEGERLDAKMKRIjES KYAPIiHLVP








LIERLGTPQQIAIAREGDLLTKEfeLCCGLSMPEVILTRlRSFLD 
DPIWRGPLPSNGVMHVDECVEFHRLWSAMQFVYCI PV3THEFTV 
EQCPGDGLHWAGCMIIVLLGQQRRFAVLDFCYHLLKVQKHDGKD 
EI IKNVPLKKMVERIRKFQILNDEI IT ILDKVLKSGDGEGTPVE 
HVRCFQPPIHQSLASS 


5835


4209 


1904 


SGNIRMAQGSHQIDFQVLHDLRQKFPEVPEVWSRCMU3KNNNL 
DACCAVLS QES TR YL YGEGD LiNFSDD SGI SGLRNHMTSLICLDLQ 
SQNIYHHGREGSRMNGSRTLTHSISDGQLQGGQSNSELFQQEPQ 
TAPAQVPQGFNVFGMSS S SGASNSAPHLGFHLGS KGTSSLSQQT 
PRFNP IMVTLAPNIQTGRNTPTSLHIHGVPPPVLNSPQGKS I YI 
RPYITTPGGTTRQTQQHSGWVSQFNPMNPQQVYQPSQPGPWTTC 
PASNPLSHTSSQQPNQQGHQTSHVYMPISSPTTSQPPTIHSSGS 
SQS S AHS Q YN 1 QN 1 5 TGPRKNQI E I KLE P PQRNNS S KLRS SGPR 
TSSTSSSVNSQTLNRNQPTVYIAASPPNTDELMSRSQPKVYISA 
NAATGDEQVMRNQPTLFISTNSGASAASRNMSGQVSMGPAPIHH 
HPPKSRAIGNNSArSPRWVTQPNT\EYTFKITVSPNKPPAVSP 
G WS PT FE LTNLLNHPDH Y VETEN I HHLTDPT LAHVDR IS ETRK 
LSMGSDDAAYTQD I *RISNS WLGM VAHACNSSALGGQDGR 1 1 +A 
QEFETSWGNIWRLRLYRRF*NYAGMVAHTCSPSYSVD*ALIjVHQ 
KARKERJLQRELB IQKKKIiDKLKSEVNEMENNLTRRRLXRSNS IS 
QI PS LE EM QQLRS CNRQLQ I D I DCLTKE I DL FQARG PHFNPS AI 
HNFYDNIGFVGPVPPKPKDQRSIIKTPKTQDTEDDEGAQWNCTA 
CTFLNHPALI RCEQCEMPRHP 


5836


361 


2303 


FrflTMCGt^CSVNFSAEHFSQDLKEDLLYNLKQRGPNSSKQLLK " 
SDVNYQCLFSAHVLHJCiRGVLTTQPVEDERGNVFLWNGEIFSGIK 
V2AEENDTQILFNYLSSCKNESEILSLFSEVQGPWSFIYYQASS 
HYLWFGRDFFGRRSLLWHFSNLGKSFCLSSVGTQTSGIiANQWQE 
VPAS \DFSE LILSLIiS FPDALFYNCILGNI FLGRI LLKKML IA* 
VXFQQTYQHLYQR* QMKPNCI LKNLLFL * I * CCHKLHWRLI AVI 
FPMCHLQERYFKS FL LMYT* KE VIQQFIDVLSVAVKKR VLCLPR 
DENLTANEVT^KTCDRKANVAILFSGGIDSMVIATLADRHIPLDE 
P IDULNVAFIAEEKTMPTTFNREGNKQ KNKCEI P S EE FS KDVAA 
AAADS PNKHVSVPDR ITGRAGLKELQAVS PSRIWNFVEINVSME 
E LQKLRRTR1 CHIiI R PLDTVLDDS IGCAVWFASRG IGWLVAQEG 
VKS YQSNAKWliTG I GADEQLAG YSRHRVRFQSHGLEGIiNKE I M 
MELGR I S S RNLGRDDRVIGDHGKEARF P FLDENWS FLN5L P I W 
EKANLTLPRGIGEKLLLRLAAVELGLTASALLPKRAMQFGSR I A 
KMEKINEKASDKCGRLQIMSLENLSIBKETKI, 


5837 


4792 


903


wgnavaqapvti^ccyiatg^kdOtiriwscsrgrgvmilklpfl 
krrggg i dptvkerlwltlhwpsnqptoi»vss cfggellqwdlt 

QSWRRKYTLFSASSEGQNHSRIVFNLCPLQTEDDKQLLLSTSMD 
RDVKCWDIATLECSWTLPSLGGFAYSLAFSSVDIGSLAIGVGDG 
M I R VWNTL S I KNNYD VKNFWQGVKS KVTALCWHPTKEGCLAFGT 
DDGKVGIiYDTYSNKPPQISSTYTnCKTVYTLAWGPPVPPMSLGGE 
GDRPSLALYSCGGEGI VLQHN PWKLSGEAFDI^KLI RDTNS IKY 
KLPVHTEI S WKAIX3K IMALGNEDGSIE I FQ\ I PNLKLICTIQQH 
HKLVNTIS WHHE \HGSPAQKLS YI*\MPSGSQQCSPFTCHWLKKC 
P * KAA PE S PSDPLQS P YRTPPQGHT AQD YP VWAWEPH IH * WEGL 
VFCFP IDG YS PGCW D \ AFPG KEAP VAI FRG \HQGRLI*CVAWSPL 
DPDCIYSG\ADDFCVHKWLTSMQDHSRPPQGKKSIBLEKKRLSQ 
PKAKPKKKKKPTLRTP VKLES I DGNEEESMKEMSGPVENGVSDQ 
EGEEQAREPELPCGLAPAVSREPVICTPVSSGFEKSKVTINNKV 
ILLKKEPPKEKPETLIKKRKARSLLPLSTSLDHRSKEELHQDCL 
VLATAKHS R E LNED VSADVE ERFHLGLFTDRATLYRM I D I EGKG 
HLENGHPELFHQLMLWKGDLKGVLQTAAERGELTDNLVAMAPAA 
GYHVWLWAVEAFAKQLCFQDQYVKAASHLLSIHKVYBAVEIiLKS 
NHFYREAIAIAKARLRPSDPVLKDLYIiSWGTVLERDGHYAVAAK 
CYLGATCAYDAAKVIiAKKGDAASLRTAAELAAIVGEDEliSASIA 
LRCAQELLLANNWVGAQEAIjQIjHESIiQGQRLVFCLLEIiIjSRHLE 
BKQLSEGKSSSS YHTWNTGTEGPFVERVTAVWKS I FSUDTPEQ Y 
2EAFQKLQNI FCYPSATNNTPAKQIiU^HI CHDLTIAVLSQQMASW








DEAVQALLRAWRS YDSGSFT I MQEVySAFLPDGCDHLRDKLGD ' 
HQSPATPAFKSLEAFFLYGRLYEFWWSLSRPCPNSSVWVRAGHR 
TLSVEPSQQLDTASTEETDPETSQPEPNRPSELDLRLTEEGERM 
LSTFKELFSEKHASIiQNSQRTVAEVQETLAEMIRQHQKSQIiCKS 
TANGPDKNEPEVEAEQPIiCSSQSQCKEEKNEPLSLPELTKRLTE 
ANQRMAKFPESIKAWPFPDVLBCCliVLLIilRSHFPGCLAQEMQQ 
QAQELLQKYGNTKTYRRHCQTFCM 


5838 


110 


98 


KTMPHLLVTFRDVAIDFSQEEWECIiDPAQRDLYRDVMLENYSNL 
ISLDLESSCVTKKLS P EKE I YEMES \ PSGR I WGNVSTI TFQYNG 
LGDNMECKGNIiEGQVS KSEGLYMCVKITCE E KATESHSTS S TFH 
RI I /HYQGKI VKCKE CRQG FS YLSCLI QHEENHNI * KCSEVNKH 
RNTFS KKPS Y I * HQ\ KFRLGEKPYECMECGKAFGRTSDLI QHQK 
IHTNEKPYQCNACX5KAFIRGSQLTEHQRVHTGEKPYDCKKCGKA 
FSYCSQYTLHQRIHSGEKPYECKDCGKAFILGSQLTYHQRIHSG 
EKP YE C KECGKAF I LG SHLTYHQRVHTGE KPYICKE CG KAFLCA 
SQLNE HQRI HTGE KP YECKECG KTF FRGSQLTYHLRVHS G ERPY 
KCKECGKAFISNSNIiIQHQRIHTGEKPYKCKECGKAFICGKQLS 
EHQRIHTGE KP FECKE CGKAF IRVA YIjTQHE KI HGEKHYE CKEC 
GXT F VRATQLT YHQR I HTGE KPYKC KECDKAF/HLWLT ILSEHQ 
RIHRGEKPYECKQCGR/LFIRGSHL/NEHLRTHTGEKPYECKEC 
GRAFS RGSEHTLHQR I HTGEKP YTC VQCGKDFRCPSQLTQHTRL 
HN*EYSSHKICMHSIALASLDFAHLQEKNPEN 


5839 


1 


2425 


GRPFPRPPRAIjPRLPLRGRRQDGRWTVDFEECLKD\SPRFRAAL 
EE VEGDVAE I*ELKL\ DKLVKLC I A\ M I DTG KAFCVANKQ FMNG I 
RD\IAQNS\NNDA\WETKFAPSFLDSLQEMINFHTIL/L*PNS 

ein*ghsfqnfvkedlrkfkdakkqfensq* KRKKIALVKNAPV 

PSRPASLEL * KP PNILTATRKCFRH IALDYVLQINVLQS KRRSE 
ILKSWLSFMYAHIiAFFHQGYDLFSEI^PYMKDLGAQLDRIjVGDA 
AKEKREM EC KHST I QQ KD FSRDDS KLKYNVDAAMG I VMEG YLFK 

RASNAFKTWNRRWFS iqnnqwyqkkfkdnptvwedlrlct vk 

HCEDIERRFCFEWSPTKSCMLQADSEKLRQAW I KAVQTS I \AT 
AYRBKDDESEXLDKKSSPSTGSLDSGNESKEKLLKGESALQRVQ 
CI PGNAS CCDCGLADPRWAS INLGITLCIECSG IHRSLGVHFS K 
VRSLTLDTWEPELLKIiMCELGNDVINRVYEANVEKMGIKKPQPG 
QRQEKEAYIRAKYVERKF VDKI FL * SLSPP\EQQKK\ FVS KS SE 
EKRLS I S KFGP \GDQVRASAQSS VRSNDS GI QQS S DDGRES LPS 
TVSANSLYEPEGERQDSSMFLDSKHLNPGLQLYRASYEKNLPXM 
AEAIAHGADVNWANSEENKATPLIQAVLGGSLVTCEFLLQNGAN 
VNQRD VQGRG PLHHATVLGHTGQ VCLFLKRG ANQHATDEEGKDP 
LS I AVEAANAD I VTLLRLARMNEEMRE SEGL YGQ PGDET YQD I F 
RDFSQMASNNPEKLNRFQQDSQKF 


5840 


698 


3610 


KHLHLPRQHLTTLWQISSPRWRSPQRAFMSALSKTQTQSAPALQ 
GLSSLLQSVTGNPVPASEAASQSTSASPANTTVYTIKGRKLPSS 
AQPFI PKSFNYS PNSSTSEVSSTSASKAS IGQSPGLPSTAFKLP 
SNTKGFTATHNTSPAAPPTEVTICQSSEVSKPKIi\ESESTSPSL 
\EMK IHNFLKGNPGFSVA* NLKHPNPAGSLGSS APS ESHP5DFQ 
RGPTSTSIDNIDGTPVRDERSGTPTQDEMMDKPTSSSVDTMSLL 
SKIISPQSSTPSSTRSPPPGRDESYPREL.SNSVSTYRPFGLGSE 
SPYKQPSDGMERPSSLKDSSQEKFYPDTSFQEDEDYRDFEYSGP 
PPS AMMNLQKKPAKS ILKSS KLSDTTEYQPILS S YSHRAQEFGV 
j\.£>mr * x& v tCftjjjjUij £j UN UJKbb &S PGJjFGAFSV RGNE PGSDRS P 
SPSKNDSFFTPDSNHNSLSQSTTGHLSLPQKQYPDSPHPVPHRS 
LFS PQNTLAAPTGHPPTSG VEKVLASTISTTST IEFKNMLKNAS 
RKPSDDKHFGQAPSKGTPSDGVSLSNLTQPSLTATDQQQgBEHY 
R I ETR VS S S CLDL PDS TEE KGAPI ETLG YHSASNRRMSGEP I QT 
VESIRVPGKGNRGHGREASRVGWFDLSTSGSSFDNGPSSASELA 
SIiGGGGSGGLTGFKTAPYKERAPQFQESVGSFRSNSFNSTFEHH 
LPPSPLEHGTPFQREPVGPSSAPPVPPKDHGGIFSRDAPTHLPS 
VDLSNPFTKEAALAHAAPPPPPGEHSGI PFPTPP PPP ? PGEHSS 
SGGSGVPFS TPPPPPPPVDHSGWPFPAPPLAEHGVAGAVAVF P 
KDHSSLLQGTLAEHFGVLPGPRDHGGPTQRDLNGPGLSRVRESL








TIiPSHSLEHLGPPHGGGGGGGStfSsSGPPLGPSHRDTlSRSGII 

LRSPRPDFRPREPFl*SRDPFHQTtTTDDDD0175Dr«nOT7'C»TV nr^nnnn 

PPRY 


5841 


1908 


762 


GLRL FLVLTVW PMM KP S WLSRTE FS KRI» LCRTti WCQSG WS SRS Y 
TRSMLKMTTS INRRS RTS TKS TRT5 ARPGLTATVS IGLSDS PTW 
RHCWMTAR S CSGEKGGH WAPRQVGV Y LL PGRVGCVS S R VS PS FP 
GDGLDSGIiARRGSAVSAIASGLVBEPMLGPPFHPTPRFKAVSAK 
SKEDLVSQGFTEFTIEDFHNTFMDLIEQVEKQTSVADLLASFND 
QSTSD YLWYLRLLTSGYLQRES KFFEH F I EGGRTVKE FCQ\QE 
\VEPMCKESDHIHIIAIAC^LQRVHPGWBYMGPRPRAATTNPHI 
FP*GLPSPKVYI i LYRPG\HYDILYKIGLGSSPLGCPGCPLLARA 
LGHCYRG FS WVKWS YFTP FFLSHDP PPMFY 


5842 


307 


1918


QEPTADFKLRSTCGCX3REMTCPDKPG0LINWFICSLCVPRVRKL
WSSRRPRTRRNL»LLGTACAI YLG FLVSQVGRASLQHGQAAE KGP 
HRSRDTAE PS F PE I PLDGTLAP PES QGNGSTLQPNW Y I TLRSK 
RSKPANIRGTVKPKRRKKHAVASAAPGQEALVGPSLQPQEA\EG 
KLMIj*HLGTLREQTWLRLESDPGGWCX3VRE/WRAGGFDFLQPSS 
RESNIRIYSESAPSWLSKDDIRRMRLLADSAVAGIiRPVSSRSGA 
RLLVLEGGAPGAVLRCGPSPCGLLKQPLDMSEVFAFHLDRILGb 
NRTLPSVSRKAEFIQDGRPCPXILWDASLSSASNDTHSSVKLTW 
GTYQQLIiKQKa^QNGRVPKPESGCTEIHHHBWSKMALFDFLLQI 
YNRLDTNCCGFRPRKEDACVQNGLRPKCDDQGSAALAHIIQRKH 
DPRHLVF I DNKG FFDRS EDNLNFKLLEG I KE FPAS AVYVLKS QH 

LRQKLLQSLFLDKGYWESQGGRQGIEKLIDVIEHRAKILITYIN 
AHGVKVLPMNE 


5843 
5844 


500 


1453


CJTARIjVTCWVI^GQ*VKKPAWEPGVVWIi*Q*RCRPKGWGLGAGM
RGSRMSQPPQCLRRAQS S CCHFMVKLLDDGTFM I PGEKVAHTSI* 
DALVTFHQQKP IE PRRELLTQPCRQKDPANVDYEDLiFIiYSNAVA 

1 EEAACPVSAPlTF'IXCD'irDVT.r'IJOC VO»Dirr»nTit»M /nnin«»«— 1 

1 * ° rtir - c " c *^»^ rxjf VJjv-HUoAERKPSAEM/RQNNHQGSHFL 

IiPPKIPSWRDPPETLEEPQNAPRERPEGPAAAKKPPRHCBLWT 

LCSCPEIHGDLRPWDRKRQPRSLRGSHLGGQRLHGSLCGHISQKP 

LTAPGTKRQKGPHQEGREVGQLH+GDPRGQELAPNGSESPILPG 
VQARAPGLGRA 


5845 


202 


2471


FDS AVLSS I N VMAVIiPG PLQIiLG VLLTI S LS S I R Ij I QAGAY YG I
KPLPPQIPPQMPPQIPQYQPLGQQVPHMPLAKDGIiAMGKEMPHIj 
OYGKEYPHLPQYMKEIQPAPRMGKEAVPKKGKEIPLASLRGEQG 
P RGE PGPRGP PGP PGLPGHG I PGI KGKPG PQGYPG VGKPGMPGM 
PGKPGAMGMPGAKGEIGQKGEIGPMGIP*PQGPPGPHGliPGIGK 
PGGPGL PG QPGPKGDRGPKGL PGPQG1RGP KGDKGFGMPGAPGV 
KGPPGMHGPPGPVGLPGVGKPGVTGFPGP\QGPLGK\PGAPGEP 
GPQGPIGVPGVQGPPGIPG IGKPGQDG\ IPGQPGFPGGKGEQGL 
PGLPGPPGLPGIGKPGFPGPKGDRGMGGVPGALGPRGEKGPIGA 
PGIGGPPGEPGLPGIPGPMGPPGAIGFPGPKGEGGIVGPQGPPG 
j PKGEPGLQGFPGKPGFIiGEVGPPGMRGFPGPIGPKGEHGQKGVP 
GLPGVPGLLGPKGEPGIPGDQGLQGPPGIPGIGGPSGPIGPPGI 
PGPKGEPGLPGPPGFPGIGKPGVAGLHGPPGKPGALGPQGQPGL 
PGP PGP PGP PG PPA VMPPTPP PQGEYLPDMGLGIDG VKP PHA YG 
AKKGKNGGPAYEMPAFTAEIiTAPFpPVGAPVXFNKLLYNGRQNY 
NPQTGIFTCEVPGVYYFAYHVHCKGGNVWVALFKNNEPVMYTYD 

EYKKGFLDQASGSAVLLLRPGDRVFLQMPSEQAAGLYAGQYVHS 
j SFSGYLLYPM 




215 


2061


HASNKKASl.QDKMANPKEKTAMCLVNEa^FNRVQPQYKLLNER
GPAHSKMFSVQLSLGEQTWESEGSSIKKAQQAVGNKALTESTLP 
KPI*KPPKSNVNNNPGCITPTVEIiNGlAMKRG\KPAIHRPLDPK 
PFPNNRANYNFQVMYNQRYHCP I PKIFYVQLTVGNNEFFGEGKT 
RQAARHNAAMKAIjQALQNEPIPERSPQNGESGKDMDDDKDANKS 

e islvfeialkrnmpvsfevikesgpphmks FVTRVSVGEFSAE 

GEGNSKKLSKKRAATTVLQELKKLPPLPVVEKPK\HPFKKRPKT 

ivkagpeygqgmnpisrlaqiqqakkekepdyvllsergmprrr 

EFVMQVKVGNEVATCTGPNKKlAKKNAAEAMIiQI^YKASTNtiQ 
[ DQLEKTGENKGWSGPKPGFPEPTNNTPKGILHLSPDVYQEMEAS 
SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine,
K=Lysine, L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)
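The legend in the column header defines a small notation: one-letter residue codes plus three annotation marks (`*`, `/`, `\`). As an illustration only, the following minimal sketch shows how an entry in the amino acid segment column could be tallied under that legend. The helper name `summarize_segment` and the returned keys are assumptions made for this example and are not part of the specification; characters outside the legend (for instance, extraction damage) are simply counted as unrecognized.

```python
# One-letter residue codes exactly as listed in the table legend.
RESIDUES = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid", "E": "Glutamic Acid",
    "F": "Phenylalanine", "G": "Glycine", "H": "Histidine", "I": "Isoleucine",
    "K": "Lysine", "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine", "S": "Serine",
    "T": "Threonine", "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine",
    "X": "Unknown",
}

# Annotation marks from the same legend (key names are hypothetical).
MARKS = {
    "*": "stop_codon",
    "/": "possible_deletion",
    "\\": "possible_insertion",
}


def summarize_segment(segment):
    """Hypothetical helper: tally residues and annotation marks in one entry."""
    summary = {"residues": 0, "unknown": 0, "unrecognized": 0}
    summary.update({name: 0 for name in MARKS.values()})
    for ch in segment:
        if ch.isspace():
            continue  # line wraps and column spacing carry no meaning
        if ch in MARKS:
            summary[MARKS[ch]] += 1
        elif ch in RESIDUES:
            summary["residues"] += 1
            if ch == "X":
                summary["unknown"] += 1
        else:
            summary["unrecognized"] += 1  # not in the legend
    return summary
```

For a made-up segment such as `"MKX*LP/GS\\T"`, the helper counts eight residues (one of them unknown), one stop codon, one possible deletion, and one possible insertion.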








RHKVISG TTLG YLS PKDMNQPS SS FFS I S PT5NS SAT I AR E LLM — 
NGTSSTAEAIGLKGSS PTPPCS PVQPS KQLE YLAR I QGFQ VHYC 
DRQSGKECVTCIiTLAPVQMTFHAIGSS IEASHDQV* YATAILLC 
YGP AR KWKA I KMEAMCAHAALLSL IHYLIAPSARLE KS KLFALG 
N- 


5846 


1126 




FSKLIKKTFIIGISGVTNSGKTTLAKNWKHIjPNCSVISQDDFF 
KPESEIETDKNGFLQYDVLEALNMSKMMSAISCWMESARHSVVS 

tdqssaeeipiliiegfliifnykpldtiwnrsyfltipyseckr 
rrstrvyqppdspgyfdghvwpmylkyrqemqditwewyldgt 
kseedlflqvyedliqelakqkclqvta*rrnttnps /ck+irk 

LQGVI 


5847 


2769 


505 


apemedi^spdstliwghnllssasfqbsvtfkdvivdftqee 

WKQLDPGQRDLFRDVTLEN YTHLVS IGLQVSKPDVISQIiEQGTE 
PWIMEPSIPVGTCADWETRLENSVSAPEPD1SEEELSPEVIVBK 
HKRDDSWS SNLLES WE YEGSLERQQANQQTLPKEIKVTEKTI PS 
WEKGPVNNEFGKSVNVSSNLVTQEPSPEETSTKRSIKQNSNPVK 
KEKSCKCNECGKAFSYCSALIRHQRTHTGEKPYKCN* /CVEKAF 

SRSENLINHQRIHTGDKPYKCDQCGKGFIEGPSLTQHQRIHTGE 
j KPYKCDECGKAFSORTHLVOHOH THTVctrxavTr'KTn'rv^v* nnA no 

HFMEHQKIHTGEKPFKCDECDKTFTRSTHIiTQHQKIHTGEKTYK 
CNECGKAFNGPSTFIRHHMIHTGEKPYECNECGKAFSQHSNLTQ 
HQKTHTGE KP YD CAE CGKS FS YWS SLAQHL K IHTGEKPY KCNEC 
GKAFS YCSSLTQHRR I HTREKPFE CSE CGKA FS YLS NLNQHQ KT 
HTQEKAYECKECGKAFIRSSSLAKHERIHTGEKPYQCHECGKTF 
* SYGSSLIQHRKIHTGERPYKCNECGRAFNQNIHLTQHKRIHTGA 
KPYECA3CX3KAFRHCSSLAQHQKTHTEEKPYQCNKCEKTFSQSS 
HLTQHQRI HTGE KP YKCNECDKAFS RS THLTQHQR IHTGBKP YK 
CNECG K\TFSQ S TYLIQHQRI HSGE KPFGCNDCG KS FR YR S ALN 
KHQRLHPGI 


5848



22 


2961 


AAPRRLLRGGDGDRTPR FPLP ALLRPG P PAEAAP ERRKM PAVS K 
GDGMRGLAVF I SD I RNCKSKEAEIKR INKELANI RSKFKGDKAL 
D3 YS KKK Y VCKLL FIFLLGHD ID JPGHMEA VNLLSSNTR YTEKQ IG 
YLFISVLVNSNSELIRLINNAIKNDLASRKTPTFMGLALHCIASV 
GSREMAEAFAGEIPKVLVAGDTMDSVKQSAALCLLRLYRTSPDL 
VPMGDWTSRVVHLl^NDQHLGVVTAATSLITTLAQKNPEEFKTSV 
SLAVSRLS\R1VTSASTDLQDYTY*FCPGFLGLSVKLLRLLQCY 
PPPDPAVRGRLTECLETIIiNKAQEPPKSKKVQHSNAKNAVLFEA 
ISLIIHHDSEPNLLVRACNQLGQFLQHRETNLRYLALESMCTLA 
S S EFS KEAVKTH 1 ETVINALKTER0VS VRQRAVDLL YAMCDRSN 
A?QI VAEMLSYLETAD YS IREE I VLKVA I LAEKYAVD YTW\ YVD 
TILNLIRIAGDYVSEEVWYRVIQIVINRDDVQGYAAKTVFEALQ 
APACHENL VKVGG Y I LGEFGNL I AGDPRS S PLIOFHLLH 3 

CSVPTRALLLSTYIKFVNLFPEVKPTIQDVLRSDSQLRNADVEL 
QQRAVEYLRLSTVASTDILATVLEEMPPFPBRESSILAKLKKKK 
GPS TVTDLEDTKRDRSVDVNGGPEPAPASTSAVSTPS PSADLLG 
LGAAPPAPAGPPPSSGGSGLLVDVFSDSASWAPLAPGSEDNFA 
RFVCKNNGVLFENQLLQIGLKS EFRQNLGRMF I FYGNKTSTQFL 
NFrPTLICSDDLQPNLNIOTKPVDPTVEGGAQVQQWNlECVSD 
FTEAPVLNIQFRYGGTFQNVSVQLPITLNKFFQPTEMASQDFFQ 
RHKQLSNPQQEVQNIFKAKHPMDTEVTKAKIIGFGSALLEEVDP 
NPANFVGAG 1 1 HTKTTQ I G CLLRLE PNLQAQMYRLTLRTS "<EAV 




3545 T 


1895 

J 
t 

< 

I 
c 
I 


KRREIKETVFHHVAQAGLELLSSSNPPSSASRSAmTGMRHQVQ 
P*DPCMSLSPPCFTEEDRFSLEALQTIHKQMDDDKDGG1EVEES 
□EFI REDMKYKDATNKHSHLHREDKHI TI EDLWKRWKTSEVHNW 
rLE DTLQWL I E FVELPQ YEKNFRDNNVKGTTL PR IAVHE PS FMI 
3 QLK t S DRS HRQKLQLKALD WLFGPLTR P PHNWMRD F I LT VS I 
/IGVGGCWFAYTQNKTSKEHVAKMMKDLESLQTAEQSLMDLQER 
jEKAQEENRNVAVEKQNL*RKMMDEINYAKEEACRLRELREGAE 

:elsrrqyaeqeleqvrwalkkaekefelrsswsvpdalqkwlq 
jTHEVEVQYYWI krqnaemqlaiakdeaekikkkrstvfgtlhv 








AHSSSLDEVDHKILEAKKALSELTTCLRERLFRWQQIEktCGFQ 
IAHNSGLPSLTSSLYSDHSWWMPRVSIPPYPIAGGVDDLDBDT 
PPIVSQFPGTMAKPPGSLARSSSLCRSRRSIVPSSPQPQRAQLA 
PHAPHPSHPRHPHHPQHTPHSLPSPDPDILSVSSCPALYRNEEE 
EEAIYFSABKQWEVPDTASECDSLNSSIGRKQSPP/SKPRDIPN 
IXS/DERYQEMRCP*RI PSGGIL 


5850 


3 


1895 


KAVLNFSASGSVISLTGSNPMHDASMWHLKKNGIIVYLDVPLLN " 
LI CRLKLMKTDRI VGQNSGTSMKDLLKFRRQYYKKW YDARVFCE 
SGASPEEVADKVLNAI KRYQDVDSETFI STRHVWPEDCEQKVSA 
BFFIEAVIEGLASDGGLFVPAKEFPKLSCGEWKSLVGATYVERA 
QI LLERCIHPADIPAARLGEMI ETAYGENFACS KIAPVRHLSGN 
QFILELFHGPTGSFKDLSLQLMPHIFAQCIPPSCNYMILVATSG 
DTGSAVLNGFSRLNKNDKQRIAWAFFPENGVSDFQKAQIIGSQ 
RENGWAVGVES DFDFCQTAI KR I FNDSDFTGFLTVEYGTI LSSA 
ICS I NWGRLL PQ WYHASAYLDLVSQG F I S FGS PVDVCI PTGNFG 
K I LAAVYAKMMG I P I RKFI CASNQNHVWTDF IKTG \HYDLRGKB 
N*AQTFFTVQ* I FLPNLSNLERHLHLMANKDGQLMTELFNRLES 
QHHFQIEKALVEKLQQDFVADWCSEGECLAAINSTYNTSGYILD 
PHTAVAK WADRVQDKTCP VI I SSTAH YS KFAP AI MQALKI KB I 
NETS S SQL YLLGSYNALPPIiHEALLERTKQQEKMEYQVCAADMN 
VLKSHVEQLVQNQFI 


5851 


3120 


1802


RCYLQFLALLLTSTSARAAAAIAAAEEPAGSPSVMTRAGDHNRQ" 
RGCCGS LAD YLTS AKFLLYLGHS L S TWGDRM WH FAVS VFLVEL Y 
GNS LLLTAVYGL WAGS VLVLGA I IGDWVDKNARLKVAQTS LW 
QWSVILCGI ILMMVFLHKHELXTMYHGWVLTSCYILI ITIANI 
ANLAS TATA I T I QRDWI VWAGEDRS KLANMNATI RR I DQLTN I 
LAPMAVGQIMTFGSPVIGCGFISGWNLVSMCVEYVLLWKVYQKT 
PALAVKAGL KEEETELKQLNLHKDTE PXPL EGTH LMGVKDSNIH 
ELEHEQEPTCAS QMAEPFRTFRDGWVS YYNQPVF/ LGWHGSCFP 
LYDCPGL* LHHHRVRLHSGTEWFHPQYFDGS I S YNWNNGNCS FY 
LATSKMWFGSDRSDLRIGTAFLFDLVCDLCIHAWKPPGLVRFSF 


5852 


1 


422 


KTTFPSSLCPLRQLPEVR6YSGQPLTDPLISLCRSHKCRGKGWG 
SSSYPSLPALLRARSAPGHCTHRSCGPEWRIDSISRLEMQGARR 
SGWAQAQPTILLLVPRLRKSLPSIWG/SLMGFFITSGPG/WFRQ 
YYFFI SGRH+ VLFTBSDFY YVAMDFGGHGL3SKYS PGVPYYLQT 
F VS BIRR WAG KKQSVYFRRCGGCS RAP PLITGGGVGSRKQRWP 
ESGAWALAPGLPAIHGRSWES 


5853 


223 


1346 


RLLGLS RVKGLHG PAASAW I SD PETRGD PGG P WGMWRG&DLR PR 
P VSLTGLTLVCK * AAQGPQ V \ HS VKLCFGLGG \P CLL\ FP I FR P 
LLLHPRRPRLHPGTRGVAVEPHALRWHVAHGEEAGI RAAGPGH 
GGVE I PQG/VGS LGARRGLRPSRPS SRHRNRVPAPP PGRPLATP 
HRRR FP PD PALTCPGLGQDQGPREQQKQGSGRHDT I LGDWGE S E 
ortwvRWivf xv. avj i/vll ijiijroKN£ J I LiNGSENWGSLVSIQEEGPDT 
GWE REKRNP AEMGNPQRWAS P I HTP PLGPE I LRAM PEALRAM PE 
ALGLRPDPATSVPS ALS / QT F / PE SWPRS CLRNQGETLGMG P VP 
LSSLCITES PSQNWTPCLLLLTCPRGLF 


5854


86 


938


KGRNTAPEKKGAALNNRENASS*NGY/SRWkQDIRRIENHIIQE " 
LXHLCAMIKRVLLERLENTRKLRELTEGRTLDWPQNRITEVSAK 
RQI VTE YREKGKRN* EEKKRDLEGRSRRYNLCI IG IPETEDRAS 
GAETI KDLLE/ENFPELKNELDLQMEKAHRI PLKFNEKKAASRH 
IRVTFL/ KFQRRN I LQASSQR KQVTYKGAKVRLTS DFS PAILNA 
RRQW/N/PISRVLRENNFEPRI I YSAKLSFLYKGNWKTFLDIQG 
LGKYINQELSLKILLKDLLQLTENLN 


5855 


536 


2391 

] 


LRSYGCKAPSRISHLHK\FLPLLLPSLLMGYSESPPPITDSWAP 
FISLTHHVLSQSQSPLSSNCWI CLSTHTQ* FTALPADLLTWTQS 
NVSLHISYLAIPFLADSFLKPV/L*PGNSAKHLSFKLSSLSMVS 
GRAVALLHLIASGLTSIQTNTASSKPPIWGY\LSTQTSFISPPP 
LCLSRTYPNPAHATMVGQVPQSLCGLIFTL /RTPCRPS ILHPNY 
KI I STS AWQKVLCFS GS PTIHTSLHLTTG SS FLS FHP IPG FPAA 
NSALYVSSLJCGPPGKNVTI PS PVTGT*QP PHRGSN /RLT VDKDN 
SEQ
ID 

NO: 



5856 



5857



5858



Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 



173 



1597 



355



Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 



1137 



1419 



5859 



5860 



5861 



"307- 



2956 



2051 



1503 



1270 



1305



Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)
fflspkpnslhqlpsqnt'pVqaltgaalagsypiwenentlswl 
ptfrynfclstpslfflcdtn*ylclpanwsgtctlvpqaptin 

I LP PNQT I L I S VEAS I S S£ P I RNKWALH L I TLLTGLG I TAALGT 
G I AG I TTS I TS YQTL FTTLSNTVEDMHTS I TS LQRQLDF LVG V I 
LQNWRVLDLLTTEKGGTCIYLQEBCCFCVNESGIVHIAVRRLHD 
RAAEL*HQVADSWWQGSSLLRWIPWVAPFLGPLIFLFLLLMIGP 
CI FNLVS R F I SQRLNCF I QASMQKH I DN I FHLCHV * YQS LRGNH 
SEAPEPRP 

PWLHGLGLSAVFLF YL* / Y VTFHL YGG I 1 LLLLIFI S I AGILYK 
FQDVLLYFPEQPSS SRL Y VPMPTG I PHBNI FIRTKDGI RLNLJ I* 
IRYTGDNSPYSPTIIYFHGNAGNIGHRLPNALLMLVNLKVNLLL 
VDYRGYGKSEGEASEBGLYLDSEAVLDYVMTSPDLDKTKIYLSG 
RS LG\GAAAIHLASDNSHRISAIMVENTFLS I PHMASTLFSFFP 
MRYLPLWCYKNKFLSYRKISQCRMPSLFISGLSDQLIPPVMKKQ 
LYELS PSRTKRLAI FPDGTHNDTWQCQGYFTALEQFI KEWKSH 
SPEEMAKTSSNVTII 



KL IGKVLVLS WADAMAAFAVB PQGP ALGS EPMMU3S PTS PKPG 
VNAQFLPGFLMGDLPAP VTPQFRS I SGPS VGVMEMRS PIiLAGGS 
PPQPWPAHKDXSGAPPVRSIYDDISSPGLGSTPLTSRRQPNIS 
VMQS P L VGVTS TPGTGQSMFSPAS I GQPRKTTLS PAQLDP F YTQ 
GDSLTSEDH\LDDSWGDCIWGFLKASA\SYILli\QFAQYGGIS* 
NMWMSNTGNWMH I R YQS KLQARKALSKDGR I FGE S I K IGVKPCI 
DKSVMESSDRCALSSPSIAFTPPIKTLGTPTQPGSTPRISTMRP 
LATAYKASTS D YQ V I S DRQTPKKDE SLYSK AME YMFG W 

PPHQPAAASTSXHQQQQPPPPPQDSSKPWAQGPGPAPGVGSAP" 

PASSSAPPAXPPTSGAPPGSGPGPTPTPPPAVTSAPPGAPPPTP 

PSSGVPTTPPQAGGPPPPPAAVPGPGPGPKQGPGPGGPKGGKMP 

GGPKPGGGPGLSTPGGHPKPPHRGGGEPRGGRQHHPPYHQQHHQ 

GPPPGGPGGRSEEKISGPRRGFKANLSLLRRPGEKTYTQRCRFC 

LLGIYliLISRRMNSRRLFAKIWENQEKFI*STKAKDSEFIKI»ESR 

ALA*NCPKPBLG*YTP*GGRQLPSSLFPTHACLP1*SCSVIFSPF 

KFPQ*NCWGRKPFRPNLGPHLKGAVCNRWDDPWEGPTGKGHCriN 
FAS ■ 

GGSSARPRASSRRMLSRKKTKNEVSKPAEVQGKYVKKETSPLLR 
NLMPS FIRHGPTI PRRTDI CLPDSS PNAFSTSGDG WSRNQS FL 
RTPIQRTPHEIMRRSSNRLSAPSYLARSLADVPREYGSSQSFVT 
EVSFAVENGDSGSRYYYSDNFFDGQRKRPLGDRAHEDYRYYEYN 
HDLFQRMPQNQGRHASGIGRVAATSLGNLTNHGSEDLPLPPGWS 
VDWTMRGR KY Y I DHNTNTTHWS HPLEREGLPPG WER VESSEFGT 
YYVDHTNKKAQY\RHPCAPTCTSV*STTSCHI/AS/RQQTERNQ 
SLLVPANPYHTAEIPDWLQVYARAPVKYDHILKWELFQLADLDX 
YQGMLKLLFMKELEQIVKMYEAYRQALLTELENRKQRQQWYAQQ 
HGKNF 



TIRVEEFPLCPGGGKAQLSSASLLGAG LLLQPPTPPPLLLLLFP 



TLHNIG FSDSGKYICKAVTFPLGNAQSSTT VTVLVEPTVSLI KG 

PDSLIDGGNETVAAI CI AATGKPVAHIDWEGDLGEMESTTTS FP 

NETATIISQYKLFPTRFARGRRITCWKHPALEKDIRYSFILDI 

QYAPEVSVTGYDGNWFVGRKGVNLKCNADANPPPFKSVWSRLDG 

QWPDGLLASDNTLHFVHPLTFNYSGVYICKVT\NSPGSKEVTQK 

VHPTFQDPSLPTYPPLPALQFQWASPSTA*TSRO\LATEP*KIA 

PSPLSTL\ATIKGWTQLPTIIA*CSGVGALFIV\LVKCFGLGIF 

CYRRRRTFRGDYFAKNYIPPSDMQKESQIDVLQQDELDPYPDSV 

KKENKNPVNNLIRKDYLEEPEKTQWNNVENLNRFERPMDYYEDL 

KMGMKFVSDEHYDENEDDLVSHVDGSVISRRE WYV 

EVCACVQAFWLVASSGDDSQGGDKCGC EVGSWVGSMRWMAfeLL 

SEGEQGIPTACAAFAQQPAG/EPRRGLAGVGEGGPQCSVA^NYRC 

TLEFLVSLLGTDLARGRGNSASGPTAPADSKQL/ML*DVHRRVI 

LE* RMNSGSPARDNAPSQRFCTNLSEGLRFG IS PSWREAliYGCH 
Predicted
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 



Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)



SEQ 
ID 

NO: 



5862



5863 



Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 



1556 



2714 



5864 



TtT" 



483 | PPFQL IMGE I KVS PD YNWFRGTVPLKKI I VDDDDSKI WS LYDAG 

PRS I RCPL I FLP P VSGTADVFFRQ I LALTGWG YRVI ALQ YP VYW 
DHLEFCTCFRKLLDHLQLDKVHLFGASLGGFLAQKFAEYTHKSP 
RVHSL I L CNS FSDTS I FNQTWTANS FWLMPA FMLKKI VLGNFSS 
GPVDPMMADAIDFMVDRLESLGQSELASRLTLNCQNSYVEPHKI 
RDIPVTIMDVFDQSALSTEAKEEMYKLYPNARPJtfiLKTGGNFPY 
LCRSAE VNL YVQ I HL/ R / RNS ME PNTR PLTHQWS VPRS LRCR KA 

ALASARRSSSVSIiAVNDELTRCVLV*SVASAPVSRPFPSGSSGS 
PVL TVSGK 



249 



PF P SRGS LPLAAPREDTMG PLMVL FCL LFL Y PGLADS AP S CPQN 
VNISGGTFTLSHGWAPGSLLTYSCPQGLYPSPASRLCKSSGQWQ 
TPGATRS LS KAVC KP VRC PAP VS FENG I YTPRLGS Y PVGGNVS F 
ECEDGFI \LRGS PVRQCRPNGMWDGETAVCDNGAGHCPNPG I Sli 

gp\vrtgfrfghgdkvryrcssnlvltgsserecqgngvwsgte 
P I crqp ys ydfpedvapalgtsfshmlgatnptqktkeslgrki 

QIQRSGHLNIiYLLLDCSQS VSENDFLI FKESAS LMVDRI FS F3I 
NVSVAI rTFASEPKVUVlSVLNDNSRDMTEVISSLENANYKDHSN 
GTGTNTYAALNSVYliMMNNQMRLtiGMETMAW\QE IRHAI ILL\T 
DGK\SHMGGSPKTAVDHIREILNINOKRNDYLDIYAIGVGiabDV 
DWRELNELGS KKDGERHAFI LQDTKALHQVFEHMLDVSKLTDTI 
CGVGNMS ANASDQERTPWHVT IKPKSQET\C\RGALI SDQWVLT 
AAHCFRDGNDHSLWRVNVGDPKSQWGKEFLIEKAVISPGFDVFA 
KKNQGIL\EFYGD\DIALL\KIjAQKVKM\STHCQGPSC3jP\CTM 
\EANLGFLRETFKGSTCR\DHENEL/VXWKQSV\PAHF\VAIj\N 
GSKLEHLTIiRMGVEWTSCCRGLSPKICKTM\FPNI,T\DVRB\VVT 
D\QFL\CS\GPQEDESP\CK*E\SGGA\VFL£KRFKJ j SAGGVWC 
SWGL\YNP\CT,GSA\DKNSPKXGPSVAKVPPPTR/DFHIN\LFP 
Q* S PWLRQHPGGMS * 1 FLPLLANGHLS P FACPARI CRPLKFLPS 
EWATLRTL 



1013 



PLISVPQSLISLPQPLLCFPGGQEPSAPSPCLYSFLWACSFTMG 
KLPPSIPPSSPLACVLKNLKPLQLTPDIjKPKCLIFFCNTAWPQY 
KLDNDS K* PENGTFEFS ILQVLDNSCHKMGKWSEVPDVQAFF \ S 
HWSLPSLCSQC/GLI PNLSSFSPFCSFG/PPPQVPSP/TES FFS 
MDSSDLPPSPQAAPRQAEPGPNSHLASAPPPYNPFITSPPHTWS 
SLQFHSVTSPPPPAQQFTLKKVAGAKGIVKVSAPFSLSQIR*RL 
GSFSSNIKIQPSSWLIWQQP 



-56T 



5866 



98 



1684 I CL PG PRWGEG WRAGHT I VGC I FFKTAI Z SHFKGGM YLCVCMCTC 

LSVCVCVQVGSWI CV/CVSMCACVSLCTC\ ICRC ISMYTREHAC 
ACTRV + VYMCMS / VCTCVS TC IDVRVCAHVCVYMCLCLG YA * AC 
TCV*MCVCMh^HVC^C/VCACSCVLL/CRGHICM/MCMSAYICI 
/C\^VCVLCVWACMRMSTCVWLVYG*ACTCVWMHM/CSCTCR/C 
VHVCCMSMHACECLCVYIiHI CGCAGTRRWWAGSARGSRSCSRLP 
CWAPGPGLSLPGPSCPSVEQGLGGGPGQLQGRSGEARLGEHRGW 
GSPAAVCSRNCTVS P RRGADC FEAPDVP KQ PPG WGRAS FEE RG C 
GGRGWVCAPPLKGPQCCCFSIKPELKAKKKK 
3197 ~ I ARPEVPA P PAWLS RRGAAKMGDKKDDKDS P KKNKGKERRDL DDl7 

KKEVAMTEH KMS VEE VCRKYNTDC VQGLTHSKAQE I LARDG PNA 
LTPPPTTPEWVKFCRQLFGGFSILLWIGAILCFLAYGIQAGTED 
DPSGDMLYLGIVLAAWI ITGCFSYYQEAKSSKIMESFKNMVPQ 
QALV I REGEKMQVNAE E VWGDli VEI KGGDRVPADLR 1 1 3 AHGC 
KVDNSSLTGESEPQTRSPDCTHE\NPLKTRNITFF5NNFVEGTA 
RGWVATGDRTVMGRIATLASG1»EVGKTPIAIEIEHFIQIjITGV 
AVFLG VS FF I LS LI LG YTWL E AVI FL IGI I VANVPEGLLATVTV 
CLTT/TAKRMAk KM C LVKNUSAVE TLGST S T I CSDKTGTLTQNRM 
TVAHMWFDNQIHEADTTEDQSGTSFDKSSHTWVALF*H/LLGFC 
NRPVFKGGQDNIPVLKRDVAGDASESALLKCIELSSGSVKLMRE 
RNKKVAE I PFNSTNKYQLS IHETEDPNDNR YLLVMKGAPERI LD 

RCSTILLQGKEQPLDEEMKEAFQNAVLBLGGLGERVLGFCHYYL 
PE EQFPKG FAFD CDD VN FTTDNLC F VGLMSMI GP PRAAVPDAVG 
KCRS AG I KV I WVTGDHPI TAKAI AKGVG 1 1 FEGNBTVEDIAARL 








NIPVSQVNPRDAKACVIHGTDLKDPTSEQIDEtX,QNHTEIVPAR 
TSPQQKLIIVEGCQRQGAIVAVTGDGVNDSPALKKADIGVAMGr 
AGSDVSKQAADMILLDDNPAS IVTGVEEGRLIFDNLKKSIAYTL 
?SNlPEITPPLLPIMANIPLPLGTITILCIDLGTDMVPAISIiAY 
EAAESD I MKRQPRNF RTDKLVNERL I S MA YGQIGM I QALGGFFS 
YFVILAENGFLPGNLVG I RLNWDDRTVNDLEDS YGQQWTYEQRK 
WE FTCHTAFFVS I VWQWADLi 1 1 CKT RRNS VFQQGMKNKI L I F 
GLF3ETALAAFLS YCPGMDVALRM YPLKPS W WFCAFP YS FL I FV 
YDEIRKLILRRNPGGWVEKETYY 


5867 


3 


1485 


LPGRRARGGRGLGWPPAQALDGSRMGKAKVPASKRAPSSPVAKP ' 

GPVKTLTRKKNKKKKRFWK3KAREVSKKPASGPGAWRPPKAPE 

DFSQNWKALQEWIitiKQKSQAPEKPLVISQMGSKKKPKIIQQNKK 

ETSPQVKGEEMPAGKDQEASRGSVPSGSKMDRRAPVPRTXASGT 

EHNKKGTKERTNGDIVPERGDIEHKKRKAK\GQPQPHPPR/IDI 

WFDDVDPADIEAAIGPEAAKIARKQLGQSEGSVSLSI*VKEQAFG 

GLTRAIAIiDCEMVGVGPKGEESMAARVSIVNQYGKCVYDKYVKP 

TEPVTDYRTAVSGIRPENLKQGEELEWQKEVAEMLKGRILVGH 

ALHNDLKVLFLDHPKKKIRDTQKYKPFKSQVKSGRPSLRLLSEK 

I LGLQVQQAEHCS IQDAQAAMRL YVMVKKEWESMARDRRPLL»TA 

PDHCSDDA*QSCPAAAAAPLQRQCDQSQGQITSPQSGNSGETFS 
ESWQRGVAWCY 


5868 


2122 


833 


LTAGASHTQDASQSTS AKYPAAAQNli/ CVTWAMREDIiADI WYIR 
AVTVYDKPASFFKETPLDLQHRLFMKLGSMHSPFRARSEPEDPV 
TERSAFTERDAGSGLVTRLRERPALIiVSSTSWTEDEDFSlLLAA 
LESRV*r\MTLDGHNLPSl»VCVITGKGPLREYYSRLlHQKHFQH 
IQVCTPWI^AEDYPLLLGSADLGVCLHTSSSGIiDLPMKVVDMFG 
CCLPVCAVNFKCLHELVKHEENGLVFEDSEEIAAQIiQMLFSNFP 
DPAG KLNQFRKNIjRES QQLRWDES W VQTVLPIiVMDT 


5869 


2122 


833 


LTAGASHTQDASQSTSAKYPAAAQNL/ CVTNAMREDLADI WYIR 
AVTVYDKPASFFKETPLDLQHRLFMKLGSMHSPFRARSEPEDPV 
TERSAFTERDAGSGLVTRLRERPALLVSSTS WTEDEDFS ILLAA 
LESRV*T\MTLDGHNIjPSLVCVITGKGPIiRBYYSRLIHQKHFQH 
IQVCTPWLFJ\EDYPLLLGSADI^VCLHTSSSGLDLPMKVVDM^G 
CCLPVCAVNFKCLHBLVKHEENGLVFEDSEELAAQLQMIiFSNFP 
DPAGKIiNQFRKNLRESQQLRWDESWVQTVLPLVMDT 


5870 


2122 


833 


bTAGASHTQDASQSTSAKYPAAAQNIj/ (^TTNAMREDLADI WYHT~ 

AVTVYDKPASFFKETPLDLQHRLFMKLGSMHSPFRARSEPEDPV 

TERSAFTERDAGSGtiVTRLRERPALLVSSTSWTEDEDFSILLAA 

I*ESRV*T\MTLDGHNLPSLVCVITGKGPl,REYYSRI,IHQKHFQH 

IQVCTPWLEAEDYPLLLGSADLGVCLHTSSSGLDLPMKWDMFG 

CCI»P VCAVNFKCLHEL VKHE ENGLVFEDS EELAAQLQMLFS NFP 

DPAGKLNQFRKNLRESQQLRWDESWVQTVLPIjVMDT 


5871 


3 


3465 

] 

• 

] 
I 
1 
I 


FFFC RP LRXi YS KTTGDRS AMAGAAGLTAE VS WKVIikRRARTKRS 

VLKLL* LS LRRL * LE PT I * NGLLT* CSRLS VFR FLKV\GS VYE P 

LKS INLPRPDNETLWDKLDH YYRI VKSTLLLYQS PTTGLFPTKT 

CGGDQ KAK I QDS L YCAAGAWALALAYRR IDDDKGRTHELEHS AI 

K<^RGILYCYMRQAJDKVQQFKQDPRPT7CLHSVFIfVHTGDELLS 

YEEYGHLQINAVSLYIiLYLVEMISSGLQIIYNTDEVSFIQNIiVF 

CV\ERVYRVP\DFG\VWGKREGKYY*/SGSTELHSSSVGLGKRQ 

L * KQFNGFNLFGNQGCS WS VI FVDLDAHNRNRQTLCSLLPRESR 

SHNTDAALLPCISYPAFALDDEVLFSQTLDKWRKLKGKYGFKR 

FLRDGYRTSLEDPNRCYYKPAEIKLFDGIECBFPIFFLYMMIDG 

VFRGNPKQVQEYQDLLTPVLHHTTEGYPWPKYYYVPADFVEYE 

KNNPGSQKRFPSNCGRDGKLFLWGQALYI I AKLLADEL I S PKDI 

D P VQR Y VP LKD QRNVS MRFSNQG PLENDLWHVAL I AE SQRLQ V 

FLNTYG I QTQTPQQVEP I QI WPQQEIiVKAYLQLGINEKLGLSGR 

?DRPIGCLGTSKIYRILGKTWCYPIIFDLSDFYMSQDVFLIiID 

5IKNALQFIKQYWKMHGRPLFLVLIREDNIRGSRFNPILDMLAA 

^KKG I IGGVKVHVDRLQTLISGAWEQLDFLRI SDTEELPEFKS 

'EELEPPKHSKVKRCSSTPSAPELGQQPDVNISEWKDKPTHEIL 

JKLNDCSCLASQAILLGILLKREGPNFITKEGTVSDHIERVYRR 








AGSQKLWS WRRAASLLS KWDS LAPS I TNVL\/QGKQVTLGAFG 
EEEE V ISNP JUS PR V I QN 1 1 Y YKCNTHDE REAV I QQE L V I HI GW I 
I SNNPE LFSGTLKIRIGW I IHAME YELQI RGGDKPALDLYQLS P 
SEVKQLLLDILQPQQNGRCWLNRRQIDGSLNRTPTGFYDRVWQI 
LERTPNGIIVAGKHI.PQQPTLSDMTMYEMNPSLLVEDTLGNIDQ 
PQYRQIWELLMWSIVLERNPELEFQDKVDLDRLVKEAFNEFQ 
KDQSRhKEI EKQDDMTSF YNTPPLGKRGTCS YLTKAVMNLLLEG 
EVKPNNDDPCLIS 


5872 


68 


665 


VQGYMYRFVI KI NSC YSEKTS I CRHRCCPELPATQPWPTPTVFF 
NIAIDSESLGCI\SFKLFADKV/PKRWKKNFVLLNTGEKVLGDK 
GPCFYRI IPG\LCXX5GDFTHHNGTGGKSL YS KEFDDENFI /DKH 
TAPGVLSTANAGPTTNGSQFF I CTAKTEDG * QHWFGKVKDGM5 
IVEAIiERSGSRNGKTSKKITAANCGQIj 


5873 


2240 


506 


RRPPEGGSGGGRRTRARMPLPWSIiALPLLLSWVAGGFGNAASAR 
HHGLLASARQPGVCHYGTKIACCYGWRRNSKGVCEATCEPGCKF 
GECVGPNKCRCFPGYTGKTCSQDVNECGMKPRPCQHRCVNTHGS 
YKCFCLSGHMLMPDATCVNSRTCAMINCQYSCEDTEEGPQCLCP 
S SGIiRLAPNGR DCLD IDE CASG KVI C P YNRRCVNTFGS Y YCKCH 
IGFELQYISGRYDCIDINECrMDSHTCSHHANCFNTQGSFKCKC 
KQGYKGNGLRCSAI PENS VKE VLRAPGTI KDRI KKLLAHKNSMK 
KKAKIKNVTPEPTRTPTPKVNLQPFNYEEIVSRGGNSHGG\KKG 
NEEKMKEGLEDEKRBEKALKD*HRRERPFRG\DVFFPKVNEAGE 
FGLI L\ VQRKALTS KLEHKADLNI S VDCSFNHG \ I CDW \ KQDR\ 
EDDFDW\ NPADR \ DNAI \GFY\MAVPGLWQGHK\ KDIGRLKLUL 
PDLQPQS NFCliLFD YRIiAGBKVGKLRVFVKNSNNALAWEKTTS E 
DEKWKTGKIQLYQGTDATKSIIFEAERGKGKTGEIAVDGVLLVS 
GLCPDSLLSVDD 


5874 


2 


3387 


ACPRIARRRRRVRSIiRJRRRGWLRARWSRGQNKMAARRITQETFD 
AVLQEKAKR YHMDASGEAVS ETIiQ FKAQDLLRAVPRS RAEM YD D 
VHSDGRYSLSGSVAHSRDAGRESLRSDVFSGPSFRSSNPS I SDD 
SYFRKECGRDIjEFSHSNSRDQVIGHRKLGHFRSQDWKFALRGSW 
EQDFGH P VS QES S WS QE YS FGP SAVLGDFGSSRL I E KE CLE KE \ 
SRDYDVDHSG\BA\DSVLRGS\SQVQA\RGRALNIVDQEGSLLG 
. KGETQGLLTAKGG VG KLVTLRNVSTKKI PTVNR I TP KTQG TNQ I 
QKNTPfl PDVTCjGTNPGTED I QFPIQK I PLGLDLKNLRLPRRKMS 
FDIIDKSDVFSRFGI EIIKWAGFHTI KDDIKFSQLFQTLFELET 
ETCAKMLASFKCSLKPEHRDFCFFTIKFLKHSAIiKTPRVDNEFI, 
NMLLDKGAVKTKNCFFEI I KP FDKY IMRLQDRLLXS VTPLLMAC 
NAYELSVKM KTLSNPLDLALAL3TTNS LCRKSLALLGQTFSLAS 
S FRQE K I L * AVGLQD IAPS PAAFPNFE DS TLFGRE Y I DHLKAWL 
VSSGCPLQVXKAEPE PMREEE KMI PPTKPE IQAKAPSSLSDAVP 
QRADHRWGTIDQLVKRVI BGSLS PKERTLLKEDPAYWFLSDEN 
SLEYKYYKLKIiAEMQRMSENLRGADQKPTSADCAVRAr-IIiYSRAV 
RNLKKKLLP \ WQRRGLLRAQG \ LRG\WKARRA\TTGTQTLLFI,R 
APGLKHHGRQAPGLS\QAKPSIjPDRND\AAKD\CPI*DPV\GPSP 
Q DPS LEAS GPSPKPAGVDI S EAPQTSS PCPSADIDMKDNGRTAE 
KLARFVAQVG\PBIEQF\SI\ENSTDNPDLWFL\HDQNSS\AFK 
F5T\RKKVFELCPSICFTSSPHNL\HTGGGDTT\GSQESPVDLME 
GEAEFEDEPPPREAELESPEVMPEEEDEDDEDGGEEAPA\PGRG 

GPSLEGSTPADGLPGEA\AEDDL/ALGAPALFTOLLQVTCFPFG 
RGFSSKSLKVGMIPAPKRVCLTnPPKVHFPVP Ta.vnD t>or»ot3R/ c* 

KKKKPKDLDFAQQKL\TDK\NLGFQ\MLQKMGWKEGHGLGSLGK 
G IR \ S RS ACTQQAAWGGSGWG LS PS TCS L PLG S FTAKMAYS WQ L 
IFVF 


5875 


296 


1848 


LAALGGL PLWRLSRRG F RE YLLGLS APSAI/3GAMRS VS Y VQR VA 
LEFSGSLFPHAICLGDVDNDTLNELWGDTSGKVSVYKNDDSRP 
WLTCSCQGMLTCVGVGDVCNKGKNLLVAVSAEGWFHLFDLTPAK 
VLDASGHHETLIGEEQRPVFKQH I PANTKVML I SDIDGDGCREL 
WG YTDR WRAFRWE ELG EGP BHLTGQLVS LKKWMLEGQ VJUSLS 
VTLGPLGLPELMVSQPGCAYAI LLCTWKKDTGS PPASEGPTDGS 
/SGDPSCPRRGAAPDIWPYPQQECLHSPNWQHQT\SHGTESSGS 
SEQ
ID 
NO: 



5876 



5877 



5878



Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 



Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 



1122 



224 



2030



1507 



950 



2113 



5879 



981 



5880 



1138



1324 



"26" 



441 



Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)

GLFALCTLDGTIiKLMEEMEEADKIiLWSVQVDHQLFALEKLDVTG" 
NGHEE WACAWDGQT YI X DHNRT WRFQVDENIRAFCAGLYACK 
EGRNSPCLVYVTPNQKI YVYWEVQLERMESTNLVKLLETXP \ST 
TACCRSWAWILTTSL*LVPCFTKRSTIQTSHHSVLPQASRIPPS 
WTCLIAGEGPF»TP1XPPKGVFGSHCAAAGSIT KQ 
HLi PLG VPS K VAG AAAME P QE ERE TQV AAWLKKI FGDH P I ^QYE V 
KPRTTE 1 LHHLS ERNR VRDRD VYL VTEDLKQKAS E YES EAKY LQ 
DLLMES VNFS PANLS S TGSR YLNALVDS AVALETKDTS LAS F I P 
AVNDLTSDLFRTKSKS EE I KIELE KLEKNLTATLVLEKCLQEDV 
KKAELHLSTER\AKVDNRRQNM\DFLKAKSEEFRFGIQAAGEQL 
SARGQ\DAFSVPIQSLVALIRBNWPRLKQQTIPLK\KKIiESYLD 
LMP\KPSHCSK^RIEEAK\REIA\SIEAELTRRV3\ MMEL 
GTLGKMAASSSGEKEKERLGGGLGVAGGNSTRERLLSALEDLEV 
LS R E L I EMLA I SRNQ KLLQAGEENQVLELL I HRDGE FQELM KLA 
LNQGK THHEMQVLEKE VEKRDSD I QQLQKQLKE AEQ I LATAVYQ 
AKEKLKSIEKARKGAISSEEIIKYAHRISASNAVCAPLTWVPGD 
PRRPYPTDLEMRSGLLGQMNNPSTNGVNGHLPGDAIA/RRKIAR 
CPCSTVS/NGSQMTCR*INIILILQKSVCEL 



GLWKCMQLgG PHTHRVQP » PTPRQQ GPQ \ VPVAVIAGNRPNYLY 
RMLRSLLSAQGVSPQMJTVFIDGYYEEPMDWADFGLRGIOHTP 
IS I KNARVSQHYKASIjTAT FWLFPEAKFAWLEEDLD I AVDFFS 
FLSQS IHLLEEDDSLYC I SAWNDQG YEHTAED PAI»LYRVETMPG 
LGWVLRRSLYKEELEPKWPTPEKXWDWDMWMRMPEQRRGRECII 
PDVSRSYHFGIVGLI^GYFHEAYFKKHKFNTVPGVQLRNVDSL 
KKEAY E VEVHRLLS EAB VLDHS KNPCEDS FLPD TEGHTYVAF IR 
ME KDDD FTTWTQLAKCLH I WDLD VRGNHRGLWRLFRKKNHFLVV 
GVPASPYSVKKPPSVTPI FLEPPPKEEGAPGA PEQT 
RLTEAAAAGaGSRAAGWAGSPPTliPLSPTSPRCAATMASSDED 
GTNGGASEAGEDREAPGKRRRUSFIiATAWLTFYDIAMTAGWLVIi 
AI AMVRF YME KGTHRGLYKS I QKTLK FFQTRALLE 17VHCL I G I V 
PTS VI VTGVQVSS R I FM VWLITHS I KP IQNEES WLFLVAWTVT 
EI TR YS F YTFSLLDHLP YFI KWARYNFFI I LYPVGVAGELLTI Y 
AALPHVKKTGM FS IRLPNKYNVS FD Y Y YFLL I TMAS YI PL FPQL 
YFHMLRQRRKVLHG \ G * L * KRMI K* S LQTRCFFQNNQD YLS PS F 
NNKNKQLCEISWIVWFLKI 

S LWCL VAGGLGLG PS S QNPLQRAG I LAR PREARGTFS ALTACS A " 
SVTSKGKSSSGMWPSAASDRDSPVPLRPPGPVQLPSGTGWVLSD 
* KKKRGRCSS/ WLSQPQHEREKEVVLLRRSMAEGERARAASDVL 
CRSLANETHQLRRTLTATAHMCQHLAKCLDERQHAQRNVGERSP 
DQSEHTDGHTSVQSVIEKLQEENRLLKQKVTHVEDLNAKWQRYN 
ASRDE YVRGLHAQLRGLQI PHEPELMRKEISRLNRQLEEKINDC 
A3VKQELAASRTARDAALE R VQMLEQQ I LAYKDD FMS ERADRER 
AQSRIQELEEKVASLLHQVSWRQDSREPDAGRIHAGSKTAKYLA 
ADALELPTVPGGWRPGTGSQQPEPPAEGGHPGAAQRGQGDLQCPH 
CLQCFSDEQGEELLRHVABCCQ 



5882 



2407 



2216 




~« v <v<vxr VA o vi'ivoti v XTLTKLSMHWVRQAPGKGLE*MGP FD 

LQDVETIYPQKFQGRVSMTEETSTETTQ/AYLELSSIiRSEDTAV 
HHCATDTV 

SGCVEMLYSH5LEYNPEWIS VQS AVAP AQLALNSDGDL* LHSGE " 
RTRRD * QLPSAGGPGIiQ EPLQ LGELDI 7SDE FI LDE VDG\ VDLR 
HYSKQVELELQQ I EQKS I RD Y I QESENIAS LHNQ 2 TACDAVLER 
MEQMI^AFQSDLSSISSEIRTLQEQSGAMNIRLRKRQAVRGKLG 
ELVDGLWPSALVTAILEAPVTEPRFLEQLQELDAKAAAVREQE 
ARGTAACADVRGVLDRLRVKAVTKIRE FILQKI YS FRKPMTNYQ 
IPQTALLKYRFFYQFLLGNERATAKEIRDBYVETLSKIYLSYYR 
SYLGRLMKVQYEEVAEKDDLMGVEDTAKKGFFSKPSLRSRNTI F 
TLGTRGS V I S PTELEAP I LVPHTAQRGEQR Y PFEAL FRSQH YAL 
I^NSC^EYLFICEFFVVSGPAAHDLFHAVMGRTLSMTLKHLDSY 
LADC YDA 1 AVFI^ IH I VXiRFRKI AAKR DVPALDR YWEQ VXALL W 


5883 




^K^bXI,EMNVQSVKSTUP(jRLGGLDTRPHVlTieRyAEFSSALV~ 
SINQTI PNERTMQLLGQLQVEVENFVLRVAAEPSSRKEQLVPLI 
NN YDMMLG VLM \ E * ERAADDS KEVES FQQLLNARTQ E PI E ELLS 
P PFGGL VAFVKE AEAL I BRGQAERLRG E EARVTQIi I RGFGS S WK 
S SVES LS QDVM RS FTNFRNGTS 1 1 QGALTQL I Q\ LYHRFHR V \ L 
I SQPQLRALPARAELINIHHLMVRT.KJfWjrDKrn. 


5884 


2 


1374 


| B FPGRR FRAVMEAGAGAG AGAAG WS C PGPGPT VTTLGS YEAS EG 
CERKKGQRWGSLERRGMQAMEGEVLLPALYEEEEEEEEEBEEVE 
EEEEQVQKGGSVGSLSVNKHRGLSLTETELEELRAQVLQLVAEL 
EE TRELAGQHEDDS L ELQGLLEDERLASAQQAE VFTKQ IQQLQG 
ELRSLREE ISLLBHEKESBLKEI EQELHLAQAE IQSLRQAAEDS 
ATEHESDIASLQEDLCRMQNELEDMERIRGDYEMEIASLRAEME 
MKSSEPSGS LG LS D YSGLQ EELQELRER YHFLNEE YRA LQESNS 
SLTGQriADLESERTQRATERWLQSQTLSMTSAESQTSEMDFLEP 
DPEMQLLRQQLRDAEEQMHGMKNKCQELCCELEELQHHRQVSEE 

EQRRLQRELKCAQWEVLRFQTSHS\SPSHPLPPIPPSSPCLL*A 
LWISALLWCWWAETSS 


5885 


4251 


2522 


GVLARAS ARLRVPLTGVRACAK PE VGAE PAK VAGAAKPDEDGGR 
SRLRDCGDYTPSERLGPKGAMLWFQGAIPAAIATAKRSGAVFVV 
FVAGDDEQS TQMAAS WEDDKVT EAS SKS FVA I K IDT KS EACLQF 
SQ I YPWC VPSS FFIGDSGI PIjEVI AGS V5 ADELVTR I H KVRQM 

HLIiKSETSVANGSQS ESS VST PSAS FEPNNTCENSQSRNAELCE 
I P STS DTKSDTATGG ES AGHATS S QEPSG CSDQ R PAED LNI RVE 
RLTKKI.EERREEKRKEEEQRE2 KKEIERRKTGKEMLDYKRKQEB 
ELTKR^EERNRBKAEDRAARERIKQQIAIJDRAERAARFAKTKE 
EVEAAKAAALLAKQABMBVKRESYARERSTVARIQFRLPDGSSF 
TNQFPSDAPLEEARQFAAQTVGNTYGNFSLATMFPRREFTKEDY 
KKKLLDLELAPSASWLLP / ALFINF* AGRPTAS IVHSSSGDI W 

TLLGTVLYPFIAIWRLISNFLFSNPPPTQTSVRVTSSEPPNPAS 
SS^E^PVRKRVLEKRGDDFKKEGKIYRI^TQDDGEDENNTW 


5886


900 


467 


AAUGGRRSRLSR^tyPTGPSKSPSGVRCU. \HH VAWEDKDEFLDV 

I YWFRQI IAWLGVI WGVLPLRGFLGIAGFCLINAGVLYLYFSN 

YLQIDEEEYGGTWELTKEGFMTSFA/IVHGHLDHLLHCHPL*!^ i 
VYSSQVLPIQSKGPS ^ | 


5887


86 


1341 


PFRGRALTLiUCQPRPGVAPPSLGTCHKSUPGkPAAgSOPPSPGS " 1 

GTFGLLSFRMVRTKTWTLKKHFVGYPTNSDFELKTSELPPLKNG 

EVLLEALFLTVDPYMRVAAXKLKEGDTMMGQQVAKWESKNVAL 

PKGTI VLASPGWTTHS I SDGKDLEKLLTEWPDT I PL8LALGTVG 

MPGLTAYFGLLEICX5VKGGETVMVNAAAGAVGSWGQIAKLKGC 

KWGAVGSDEKVAYLQKLGFDWFNYKTVESLBETLKKASPDGY 

DCYFDNVGGEFSNTVIGQMKKFGRIAICGAISTYNRTGPLPPGP 

PPEIGIYQELRMEAFWYRWQGDARQKALKDLLKWVLELPYFVI 

D * LQANTL VYKS MKS AKPS LE YI SE KLVSG\KIQYKEYI I EG FE 
NMPAAFMGMLKGDNLGKT I VKA 




1937


104 


>i t -^^K^L.KATKCPCRGPKWUSLGDEAARSPAAPGGAPGLLGLRE^- 

RPDRCHPGGDDRGPQLHRGSPG/SPSELSRRPGPPGIi D GLQGPP 

PAPGLPQSRTL/PVLCVCDLSPAQCDINCCCDPDCSSVDFSVFS 

ACSVPWTGDSQFCSQKAVIYSLNFTANPPQRVFELVDQINPSI 

FCIHITNV*NLHYPLLIQKYL/NENNFDTLMKTSDGFTLNAESY 

VSFTTKLDlPTAAKYEYGVPLQTSDSFLRFPSSLTSSLCrDNNP 

AAFLVNQAVKCTRKINLEQCEEIEALSMAFYSSPEILRVPDSRK 

KVPITVQS I VIQSLNKTLTRREDTDVLQPTLVNAGHFS LCVNW 

LEVKYSLTYTDAGEVTKADLSFVLGTVSSVWPLQQKFEIHFLO 

ENTQPVPLSGNPGYWGLPIAAGFQPHKGSGIIQTTNRYGQLTI 

LHSTTEQDCLALEGVRTPVLFGYTMQSGCKLRLTGALPCQLVAO 

KVKS LLWGQG FPD YVAP FGNS QGP/ ADMLD WVP I HF I TQS FNRK 

DS CQ LiPGAL VI E VKWTKYGS LLNPQAKI VNVTANL I S S S FPEAN 

SGNERTILISTAVTFVDVSAPAEAGPRAPPAINARLPFNFFFPF 








5888 


375 


2302 


llcrtpgvamqradseqpskrprCddsprtpsntpsaeadwspg 

LELHPDYKTWGPEQVCSFLRRGGFBBPVLLKNIRENEITGALLP 
CLDESRFENLGVSSLGBRKKLLSYIQRLVQIHVDTMKVINDPIH 
GH I ELHPLLVR 1 1 DTPQ FQRLR Y I KQLGGGY YVFPG AS HNRFEH 
S IiG VG YLAGCIiVHALGE KQPELQ IS 3RDVLC VQ I AGLCHDLGHG 
PFSHMFDGRFI PLARPEVKWTHECX5S VMMFEHLINSNG I KPVME 
QYGLIPEEDICFIKEQIVGPLESPVEDSLWPYKGRPENKSFLYE 
I VSNKRNG1 DVDKWDYFARDCHHliG IQNNFDYKRFI KFARVCE V 
DNELRICARDKEVGNLYDMFHTRNSLHRRAYQHKVGNIIDTMIT 
DAFLKADDYIErTGAGGKKYRISTAIDDMEAYTKLTDWIFLEIL 
YSTDPKOKDAREILKQIEYRNLFKYVGETQPTGQIKIKREDYES 
LPKEVASAKPKVLLDVKIjKAEDFIVDVINMDYGMQBKNPIDHVS 
FYCKTAPNRAIRITKNQVSQLLP\EKFAEQ\LIRVYCKKVDRKS 

lya\arqyfvqw\cadr\nft\kpqdgrcy*pptp*hpqkkgw\ 

NDSTFSPKIPTRIiPRRLPKSRV\QLFKDDPM 


5889 
5890


1831 


731 


lpaacgrpvtarprqapegrsgrprdiopyppqvfpprpdrvai 
vtggtdgigystakhlarlgmhviiagnndskakqwskikeet 
lndket * vllccpgwlclwnssdppts asrgagttgvhhhfiilk 
fgifil\dlasmtsirqfvqkfkmkki?lhvlinnagvmmvpqr 

KTRDGFEEHFGLNYLGHFIiljTNLLLDTIiKESGSPGHSARWTVS 
SATHYVAELiNMDDXjQS sacys phaayaqsklalvl ftyhlqrll 
AABGSHVTAKWDPGXAmTDLYKHVFWATRLAKKLLGWLLFKTP 
DEGAWTS I YAAVTPEIiEGVGGRYLYNKKETKSLHVTYNQKLQQQ 
LWS KS CEMTGVLDVTL 


i 

i 


1322 


200 


FRRG WS AAGRAVP VAF CSR I S ASS PRRPRGAVRLQSGTEAACRS j 
GRPDPRPASAAGGHAGERMSQRDTLVHLFAGGCX3GTVGAI LTCP 
LEWKTRLQSSSVTLY ISE VQLNTMAGAS VNRVVSPGPLHCLXV 
I LE KEG PRS L FRGLGPNZi VGVAPSRAI YFAAYSNCKE KLND VFD 
PDSTQVHMISAAMAGFTAITATNPIWLIKTRLQL* /SQGTAGKR 
RMGAFECVRKVYQTDGLKGFYRGMSASYAGISETVIHFVIYESI 
KQKLLEYKTASTMENDEESVKEASDFVGMMLAAATSK\liVATTI 
AyPHEWRTRLREEGTKYRSFFQTLSLLVQEEGYGSLYRGLTTH 
LVRQ I P \NTA IMMAT YEL W YLLNG [ 


5891 
5892 


1322 


200 


FRRGWS AAGRAVP VAFCSR I SAS S PRR PRGAVRLQSGTEAAORS 1 
GRPD PRPAS AAGGHAGERMSQRDTLVHLFAGGCGGT VG A 1 1 iTCP 
LEWKTRLQS S S VTL Y I S EVQ LNTMAGAS VN R WS PGPLHCLKV 
I LEKEG PRS IiFRGLGPNIjVGVAPSRAI YFAAYSNCKEKLNDVFD 
PDSTQVHMI SAAMAGFTAITATNPIWLI KTRLQL* /SQGTAGKR 
RMGAFECVRKVYQTDGLKGFYRGMSAS YAGI setvihfvi YES I 
KQKLLEYKTASTMENDEES VKEASDF VGMMLAAATS K\IjVATTI I 
AYPHEVVRTRLREEGTKYRS FFQTLSLL VQEEGYGS L YRGLTTH 
LVRQIP\NTAIMMATYELiWYLLNG 1 


5893 


17^4 


379 


VVliRVCGRLSVNSAVSSRTGGWSAGLTCAMQRLQWLGrtLRGPAj 
DSG WMPQ AAPCLS G APHASAADWWHGRRTAI CRAGRGGFKDT 
TPDELLSAVMTAVLKDVNLRPEQLGDICVGNVLQPGAGAIMARI 
AQFLSD I PET VP LSTVNRQ CS S GEjQAVAS IAGGI RNGS YDIGMA 
CGVESMS LADRGNPGNI TSRLMEKEKARDCL I PMG ITS ENVAER 
FGISREKQDTFALASQQKAARAQSKGCFQAEIVPVTTTVHDDKG 
TKRSITWQDEGIRPSTTMEGIAKLKPAFKKDGSTTAGNSSQVS 
DGAAA I LIiARRS KAEE LGLP I IiGVLRS YA WG VPPD I MG IGPAY 
AI PVALQKAGLTVSDVD I FEINE \AFASQAAYC VEKLRL PP * EG 
* TP LGGAS G P * GH PLG LHWGHVQ VI TLAQ * S * S ARG KRAYRSR r 
PCAIGSWNGS PLPVFE YPWGT 




3 


1653 



IIjSKRRCQKAKTKELMAKKVAVIGAGVSGLISI*KCCVDEGLEPT~" 

wFERTEDIGGVWRFKJENVEDGRASIYQSVVTNTSKEMSCFSDFP 

viPEDFPNFLHNSKLLEYFRIFAKKFDLLKYIQFQTTVLSVRKCP 

5FSSSGQ WKWTQSNGKEQSAVFDAVMVCSGHHI LPHI PL.KS FP 

5MER FKGQ YFHS RQ YKHPDGFEGKR I LVIGMGNIX3SD I AVE LSK 

7AAQVFISTRHGTWVMSRISEDGYPWDSVFHTRFRSMLRNVLPR 

CAVKWMIEQQMNRWFNHENYGLEPQNKYIMKEPVLNDDVPSRLL 

:gaikvkstvkeltetsaifedgtveenidvi IFATGYSFSFPF 
SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine, S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








LEDSLVKVENNMVSLYKYIFPAHLDKSTIACIGLIQPLGSIFPT 
AELQARWVTRVFKGLCSLPSERTMMMDIIKRNEKRIDLFGESQS 
QTLQTNYVDYLDEIJy^EIGAKPDFCSLLFKDPKLAVRLYFGPCN 
SY*YRLVGPGQWEGARNAIFTQKQRILKPLKTRALXDSSNFSVS 
FLLKILGLIiAVWAFF\ CQLQWS 


5894 


174 


1673 


RYSPKKVLQNKESSLKLGMATALVSAHSIiAPLNLKKEGLRWRE 
uti i :> iwisytrt AJjUCNSKGIjGQEPLCKQFRQLRYEETTGPREALS 
RLRELCQQWLQPETHTKEHILELLVLEQFLIILPKEIiQARVQEH 
HPESREDWWLEDLQLDLGETGQQVDPDQPKKQKILVEEMAPL 

kgvqeqqvrhecevtkpekekgeetriengkliwtdscgrves 

SGKISEPMEAHNEGSNLERHQAKPKEKIEYKCSEREQRFIQHLD 
LI EHASTHTGKKLCESDVCQSS SLTGHKKVLS * ERKVIQC\HGV 
LGKAFQRSSHLVRHQKIHLGEKPYQCNECGKVFSQNAGliliEHLR 
IHTGEKPYLCIHCX3KNFRRSSHLNRHQRIHSQEEPCBCKECGKT 
FS QALLLTHHQR I HS HS KS HQCNECG KAFS LTS DL I RHHRI HTG 
BKPFKCNI CQKAFRLNSHIAQHVRI HNEEKPYQCSECGEAFRQR 
SGLFQHQRYHHKDXLA 


5895 


2967 


86 


HPSLLGAIPFYPPPSSPWPPPLYLFWNSHRKSRHFINQRGIHGB 
KRLFVSIXA^GCLP VLAAAGRARGRAEVL ISTVGPEDCVVPFLT 
RPKVP VLQLDS GNYLFSTSAI CRYFF\IiLSGWEQDDLTNQWLEW 
EATELQPTLSAALYYL\WQGXKG\EDVI/3SVRRTLTHIDHSLS 
RQ\NCPFLAGETESLADIV^WGALYPIiLQDPAYLPEELSALHSW 
FQTL S TQ \ E P CQR\ AARRL VLXQ \QG VLALR\ P YLQXQPQ PSPA 
EGKGLSP I E PE3EELATLS EEE I AMAVTAWE KGLES LP PLRPQQ 
NPVXiPVAGERKTVLITSALPYVKNVPHLGNI 1GCVLSADVFARYS 
RLRQWNTLYLCGTDE YGTATETKAL \EEGLTPQE ICDK YHI IHA 
DIY\RWF^ISFDIFGRTTTPQQ\TKIT\QDIFQQLLKRGFVtiQD 
TVEQLRCEHCARF\LADRFVEGVCPFCGYEEARGDQCDKCGKLI 
NAVELKKPQCKVCRSCPWQSSQHLFLDLPKLEKRIiEEWLGRTL 
PGS DWTPNAQ F I TPFFG FREWP S KPRWQ*TRDL K\ WGNPGTP* B 
GFEDK\VFYVWFDATIGYLSITANYTDQWERWW\KNPBQVDLYQ 
FM \ AKDJJVP FHS L VFPS S ALGAEDN YT.L \ VSHL I ATE YLN YEDG 
K\FSKSRGVGVFRDM\AHDTGIPPDISRFYL\LYIRPEGK\DSA 
FS WTDLLLXNNS \ ELLNNLGNF INRA\ GMFVS KFFGG \ YVPEMV 
LTPDDQRLLA\lIVTLBLQnYIIQ\LLEKVRIRDALRSILTIS\RH 
GNQYI\QVNEPW\KRIKGSEADRQRAGTVTGLAVNIAALLSVML 
QP YMPT VS AT I QAQLQLP P PACS I LLTNFLCTLPAGHQ I GTVS P 
LFQKLENDQ I ES LRQRFGGGQAKTS PKPAWBTVTTAKPQQI QA 

LMDEVTKQGNI VRELKAQKADKNBVAASVAKLLDLKKQLAVAEG 
KPPEAPKGKKKX 


5896 


29617 


66 



HPSLLGAIPFYPPPSSPWPPPLYLFWNSHRKSRHFINQRGIHGE 

MRLFVSDGVPGCLPVLAAAGRARGRAEVLISTVGPEDCVVPFIiT 

RPKVPVLQLDSGNYLFS TSAI CRYFF \LLSG WEQDDLTNQWLEW 

BATBLQPTLSAALYYL\WQGKKG\EDVLGSVRRTLTHIDHSLS 

RQ\NCPFIiAGETESLADIVLWGALYPLLQDPAYLPEELSALHSW 

FQTLSTQ\EPCQR\AARRLVLKQ\QGVLALR\PYLQKQPQPSPA 

EGKGLSPIBPEEEELATLSEEEIAMAVTAWEKGLESLPPLRPQQ 

NPVLPVAGERKVLITSALPYVNNVPHLGNI IGCVLSADVFARYS 

RLRQWNTLYLCGTDEYGTATETKAL\EEGLTPQEICDKYHIIHA 

DI Y \ RW FNI S FD I fgrtt tpqq\tki t\ QDI FQQLLKRGFVLQD 

T VEQLRC EHCAR F \ LADR FVEGVC P FCG YEEARGDQ CDKCG KL I 

NAVELKKPQCKVCRSCPWQSSQHLFLDLPKLEKRLBEWLGRTL 

PGSDWTPNAQFITPFFGFREWPSKPRWQ*TRDLK\WGNPGTP*E 

GFEDK\VFYVWFDATIGYLSITANYTDQWERWW\KNPEQVDLYQ 

FM\AKDNVPFHSLVFPSSALGAEDNYTL\VSHLIATEYliNYEDG 

K\FSKSRGVGVFRDM\AHDTGIPPDISRFYL\LYIRPEGK\DSA 

FSWTDUlLKNNS\Et*L^3NLGNFrNRA\GMFVSKFFGG\YVPEMV 

LTPDDQRLLA\HVT LELQH YHQ\ LLEKVR I RDALRS I LT I S \ RH 

3NQYI \QVNEPW\KRIKGSEADRQRAGTVTGLAVNrAALLSVML 

3PYMPTVSATIQAQLQLPPPACSILLTNFLCTLPAGHQIGTVSP 

LFQKLENDQIESLRQRFGGGQAKTSPKPAVVETVTTAKPQQIQA 








LMDEVTKQGNIVRELKAQKADKNEVAAEVAKLLDLKKQLAVAEG 
KPPEAPKGKKKK 


5897 


29S7 


86 


HPSLLGAIPFYPPPSSPWPPPLYLFWNSHRKSRHFINQRGIKGE 
MRLFVSDGVPGCLPVLAAAGRARGRAEVLISTVGPEDCWPFLT 
RPKVPVLQLDSGNYLFSTSAICRYFF\LLSGWEQDDLTNTQWLEW 
EATELQPTLSAALYYL\WQGKKG\EDVLGSVRRTLTHIDHSLS 
RQ\NCPFIiAGETESIiADIVLWGALYPLLQDPAYIiPEELSALHSW 
FQTLSTQ \E PCQR \AARRLVLKQ\QGVLALR \ PYLQKQPQPS PA 
EGXGLS PIE PEEEELATLSEEE IAMAVTAWEXGLESLPPLRPQQ 
NPVLPVAGERNVLITSALPYVNNVPHLGNI IGCVLSADVFARYS 
RLRQWNTLYLCGTDE YGTATETKAL\EEGLTPQEI cdkyhi IIIA 
DIY\RWFNISFT)IFGRTTTPQQ\TKIT\QDIFQQI*LKRGFVLQD 
TVEQIxRC EHCAR F \ LADRFVEG VCP FCGYE EARGDQCDKCG KL I 
NAVE LKKPQCKVCRSCPVVQSSQHLFLDLPKIjEKR LEE WLGRTL 
PGSDWTPNAQFITPFFGFREWPSKPRWQ*TRDLK\WGNPGTP*E 
GFEDK WFYVWFDATIGYLS I TANYTDQWER WW\ KNPEQVDLYQ 
FM \ AKDNVP FHS LVFP S SALGAEDNYTL\ VSHLIATE YLNYEDG 
K\FSKSRGVGVFRDM\AHDTGIPPDISRFYL\LYIRPEGK\DSA 
FSWTDLLLKNNS\ELLNNLGNFINRA\GMFVSKFFGG\YVPEMV 
LTPDDQRLLA\HVTLELQHYHQ\LLEKVRIRDALRS ILTIS\RH 
GNQYI \Q VNE P W \ KR I KGSEADRQRAGTVTGIiAVN I AALLSVML 
QP YMPTVS AT IQAQLQLPPPACS I LLTN FLCTLPAGHQ I GTVS P 
LFQKLENDQ I ESLRQRFGGGQAKTS PKPAWET VTTAKP QQIQA 

LMDSVTKQGNIVRELKAQKADKNEVAAEVAKTiIjDLKKQLAVAEG 
KPPEAPKGKKKK 


5898 


2967 


86 


hpsllgaipfypppsspwppplylfwnshrksrhfinqrgihge 
mrlfvsdgvpgclpviaaagrargraevzilstvgpedcvvpflt 

RPKVPVLQLDSGNYLFSTSAICRYFF\LLSGWEQDDI,TNQWLEW 
EATELQPTLSAALYYIi\WQGKKG\EDVLGSVRRTLTHIDHSLS 

rq \ncpfitagetes ladi vlwgaijypllqdpaylpeelsaiihsw 
fqtbstqXepcqrXaarrlvlkqXqgvialrVpylqkqpqpspa 

EGKGLSP I EPBEEBLATLSEEE IAMAVTAWEKGLESLPPLRPO.Q 

NPVLPVAGERNVLITSALPYVNNVPHLGNI I GCVLS ADVFAR YS 

RLRQWNTLYIiCGTDEYGTATETKAl*\EEGLTPQEICDKYHIIHA 

DIY\RWFNISFDIFGRTTTPQQ\TKIT\QDIFQQLLKRGFVLQD 

TVEQLRCEHCARF\LADRFVEGVCPFC!GYEEARGDQCDKCGKLI 

NAVELKKPQCKVCRSCPWQSSQRLFIjDLPKLEKRLEEWLGRTL 

t uuun a ck*f\\£r iiyr r »jrKi!#w.ro Jx«'KWQ w TRDIjK\WGWPGTP*E 

GFEDK\VFYVWFDATIGYLSITANYTDQWERWW\KNPEQVDLYQ 

FM\AKDNVPFHSLVFPSSALGAEDNYTL\VSHLIATEYLNYEDG 

K\FSKSRGVGVFRDM\AHDTGIPPDISRFYL\LYIRPEGK\DSA 

FSWTDLLLKNNS\ELLNNLGNFINRA\GMFVSKFFGG\YVPEMV 

LTPDDQRLLA\IlVTLELQHYHQ\LLEKVRIRDALRSIIiTIS\RH 

GNQYI\QVNEPW\KRIKGSEADROBAfVTVTV»T.2lVKiT2iaT t o\tmt 

QP YMPTVSAT I QAQ LQL P P PACS I LLTNFLCTLP AGEQ I GTVS P 

LFQKLENDQ IESLRQRFGGGQAKTS PKPAWETVTTAKPQQIQA 

LMDEVTKQGN I VRELKAQKADKNEVAAEVAKLLDLKKQIjAVAEG 

KPPEAPKGKKKK 


5899 
5900


326 


1078 


NCPKS KE PNG vrap slps plraamalsd VD vkkqi khmmafi eq 

EANEKAEBIDAKAEEEFNIEKGRLVQTQRLKIMEYYEKKEKQIE 
QQKKILMSTMRNQARIiKVLRARNDI*ISDLLSEAKIjRLSRIVEDP 
E VYQGLLDKLVLQGLLRLLE PVM I VR CR P \ QD LLLVEAAVQ KA I 

pbymtisqkhvev\qidkea*lavecswevwevysgnqrikvsn 

TLESRLDLSAKQKMPEIRMALFGANTNRKFFI 




64 


1409 


KAASRDS P CLE FCPLCGVS SHDLQHRM WYHRLSHTjHSRI*QDLLK "" 

ggviypalpqpnfksllplavhwhhtasksltcawqqhedhfel 
kyantvmrfdyvwlrdhcrsascynskthqrsldtasvdlcikp 
kti rldettlfftwpdghvtkydlnwlvkns yegqkqkviqpr i 
lwnaeiyqqaqvpsvdcqsfletneglkkflqnfllygiafven 
vpptqehteklaerisliretiygrmwyftsdfsrgdtaytkla 
ldrhtdttyfqepcgiqvfhclkhegtggrtllvdgfyaaeqvl 








QKAPBE FELLS KSAI \KHEYIEDVGECHQPHDWDWAQS* ISTHG 
/ YKEL Y L I R YNN YDRAVINTVP YDVVHRW YTAHRTLTI ELRRPE 
NEFWVKLKPGRVLFIDNWRVLHGRECFTGYRQLCGCYLTRDDVL 
NTARLLGLQA 


5901 




2121 


VAI EQTSLKMKQAVGGAPAR PTGE YI CWQCGAKYTS LDSFQTHL"" 
KTHLDTVLPKLTCPQCNKEFPNQESLLKHVTIHFMITSTYYICE 
SCDKQFTSVDDLQKHLLDMHTFVFFRCTLCQEVFDSKVSIQLHL 
\ AVKHSNEKK V YRCTS CNWD FRNETDLQ LHVKHNHI»EN§GKVHK 
C I FCGES FGTEVELQCHI TTHS KK YNCKFCS KAFHAI I LLE KHL 
R EKHC VFETKTPNCGTMGAS EQVQ KEE VE LQTLLTNS QESHNS 1 1 
DGSEEDVDTSEPMYGCDICGAAYTMETLLQNHQLRDHNIRPGES 
AIVKKKAELIKG^KCNVCSRTFFSENGLREHMQTHLGPVKKYM 
CPI CGERFPSLLTLTEHKVTHSKSLDTGNCR I CKMPLQSEEE FL 
EHCQMKPDLRNSLTGFRCWCMQTVTSTLELKIHGTFHMQKTGN 
GSAVQTTGRGQHVQKLYKCASCLKEFRSKQDLVKLDINGLPYGL 
CAGCVNLSKSASPGINVPPGTNRPGLGQNEMLSAIEGKGKVGGL 
KTRCS*LATFKF*VLKVELPEPHPKPFHRGVSRPDSNSTQLKTP 
QVS PMPR I S PSQSDE KKT YQCI KCQMVFYNE WD IQ VH VANHM I D 
EGU2JHE CKLCS QTFDS PAKLQCHL X EHS FEGMGGTFKCP VCFT V 
FVQANKLQQH I FS AHGQE DK I YD CTQC PQKF F FQTB LQNHTMTQ 
HSS 


5902 


712 


209 


LKNRRRSRPS I RQSIGS TS VSRWLTSL FT YLDHTAD VQ * V* REF " 
IPLXPRQ* ED* MFQSWLHAWGDTLEEAFEQCAMAMFGYMTDTGT 
VE P LQT VF. VETQGDDLQS LLFH FLDE WL YKFSADE FF I P \ G WGE 
EFSLSKHPQGTEVKAITYSAMQVYNEENPEVFVIIDI i 




2106 


735 


DTPGPSLPSTTAPFSLRSLSFPSRPSYLLPGDPQPLQGRGLPTT" 
PALFALSAVPGGAASPMPPSGLRLLPLLLPLLWLLVLTPGRPAA 
GLSTCKTIDMELVKRKRIEAIRGQILSKLRLASPPSQGEVPPGP 
LPEAVLALYNS TRDRVAG ESAEPE PE PEAD YYAKEVTR VLM VET 
HNEI YDKFKQSTHS I YMFFNTSELREAVPE PVLLSRAELRLLRL 
KLKVEQHVELYQKYSNNSWRYLSNRLLAPSDSPEWLSFDVTGW 
RQWLSRGGE I EGFRLS AHCS CDSRDNTLQ VD I NG FTTGR \ RGDL 
ATIHGMNRPFLLLMATPLERAQHLQS\SRHRQAL\DTNY\CFSF 
HGGRNCLRC/VHC*HHFRKDL\GW\KW1 \HE\PKGYHANFC\L 
G PCPY I WSLDTQYSKVLALYNQ\HXPG\ASAAP \CCVPQALEP \ 
LPIVYY\VGRKPKVEQLSNMIVRSCKCS 


5904 


3 


1126 


MMEE I ENA INTFKEEQRL I YEEL I KEE KTTNNE LS AI S RK I DTW 
ALGNSETE KAFRAI SS KVP VDKVTPS TLPEE VLD FEKFLQQ TGG 
RQGAWDDYDHQNFVKVRNKHKGKPTFMEEVLEHLPGKTQDEVQQ 
HEKWYQKFLAIiEERKKESIQIWKTKKQQKREEIFKLKEKADNTP 
VLFHNKQEDNQKQKE E QRKKQKLAVE AWKKQKS I EMSMKCAS QL 
KEEEEKEKKHQKERQRQFKLKLLLESYTQQKKEQEEFLRLEKEI 
RBKAEKAEKRKNAADE I SRFQERDLHKLELKI LDRQAKEDE KSQ 
KQRRLAKLKEKVENNVS RD PSRLY/ NTHQRLGRTNQKDRTNRLW 
ATSTYPT*GYSNLETRNTEKSMR 


5905 


287 


2912 


MAS FPPR VWEKE I VRLRTIGELIiAPAAPFDKKCGRENWTVAFAP "~ 
DGSYFAWSQGHRTVKLVPWSQCLQNFLLHGTKNVTNSSSLRLPR 
QNSDGGQKNKPREHIIDCGDIVWSLAFGSSVPEKQSRCVNIEWH 
RFRFGQDQLLLATGLNSGRI KI WD VYTG KLLLNLVDHTG VVRDL 
TFAPDGSLILVSASRDKTLRVWDLRDDGN\MMKVLRGHONHVY\ 
SCAFSPDSSMLCSVGASKAVVAAILV*LRLCKHHSHT^aTMVT q 
WAE R VASLATG LGATFTI G* SNLAF VLQG VL YVHR CWSM STFCF 
SFFLFFFFKVISPTVKYH*LLSKLIFQFYGIGSLTSETNLM*SI 
WLSNGFS VL FFG I LSDSRDI LRL* FNLKFVLI FF * K* CIVS VQK 
KKKPKRIALLQEERLS*DKPPSSHLI*QTEVNIRILFRAILHS* 
LLI FR I * NC 1 * T YS * 1 1 DP FYI QMT YDRG*FGKNKMVKF* F I EM 
* LY Y FHKI AFS FCNW*HPCCLPKKFHLAVN I L FACS I CFS S * A 
QVGD PSLL* TSDYLKGRCQWSNNLLTLRFLSVYFFKNLWSGKK 
REGGL*YLTLF3SVYFS*LVFGINGFQYSFWKLHCLYFMFRLI 
FKLT FNRNI *NR I CMSAL INLKTD FNLTM TLS I F FKLLI I YNA* 
JfNLN* I * QF* YKMCHFVLCMS B *S YNI CLFIAGF \ LWNMDK YTM 








IRKLEGHHHD WACD FS PDGALLATAS YDTR VYI WD PHNGD I LM 
EFGHLFPP PT PI FAGGANDRW VRS VSFSHDGLHVASLADDKM VR 
FWR I DEDYP VQVAPLSNGLCCAFSTDGSVliAAGTHDGS VYFWAT 
PRQVPSLQHLCRMS IRRVMPTQEVQELP I PS KLLBFLS YRI 


5906 


146


2038 


REGAGSGRMASGA\ YNP YI E 1 1 EQ P RQRGMRFR YKCEGRS AG S I 
PGEHSTDNNRTYPSIQIMNYYGKGKV\RITLVTK\NDPYKPHPH 
DL VG KDCRD \G YYEAE FGQE \ RRP \ LFFQN \LG I RC VKKKEVKE 
A\ IITR\ IKAGINPFDVP*KQLNDIEDCDLDVVRLWFRVFIfPDG 
HGNZj \ TTALPP V\ VSS P I YDNRAPNTAELR VCR VNKNCGS VRGG 
DE IFLLCDKVQKDD I EVR F VLNDWEAKG I FSQADVHRQVAI VFK 
TPPYCKAITEPVTVKMQLRRPSDQEVSESMDFRYLPDEKDTYGN 
KAKKQ KTTLLFQ KLCQDHVETG FRH VDQDGLEL LTSGDP PTLAS 
QSAGITVNFPERPRPGLLGSIGEGRYFKKEPNLFSHDAVVREMP 
TGVSSQAESYYPSPGPISSGLSHHASMAPLPSSSWSSVAHPTPR 
SGNTN PLS S FS TRTL PSNSQG I PPFLRI PVGNDLNASNAC I YNN 
ADDIVGMEASSMPSADLYGISDPNMLSNCSVNMMTTSSDSMGET 
DNPRIiLSMNLEN PS CNS VLDPRDLRQ LHQMS S SSMSAGANSNTT 
VFVSQSDAFEGSDFSCADNSMINESGPSNSTNPNSHVFVQDSQY 
SGIGS MQNEQLSDSFPYEFFQV 


5907 


99 


1873 


TYLLSSWSS * *NLDTKlKSQVK\//RKGHkKZS WPYPQPAKQNGK 
KATSKVPSAPHFVHPNDHANREAELKXKWVBEMREKQQAAREQE 
RQKRRT I ES Y CQD VLRRQE E FEHKE E VLQE LNM FPQBDDEATRK 
AYYKEFRKWE YSDVI LEVLDARDPLG CRCFQMEE AVLRAQGNK 
KLVLVLNKIDLVPKEWEKWLDYLRNELPTVAFKASTQHQVKNL 
NRCS VPVDQASESLLKS KACFGABNLMRVLGN YCRLGEVR THIR 
VG WGLPNVG KS SLINS LKRS RACS VGAVPG I TKFMQE VYLDKF 
IRLLDAPGIVPGPNSEVGT I LRNCVH VQKLADP VTP VETI LQRC 
NL EE I SNYYGVS G FQTTEH FLTAVAHRLGKKKKGGL YSQEQAAK 
AVLAD WSGK I S FYIPPPATHTLP THLSAE I VFCPMTPVPn T ptyp 
EQANEDTMECLATGESDELLGDTDPLEMEIKLLHSPMTKIADAI 
ENKTTVYKIGDLTGYC^PNRHQMGWAKRNVDHRPKSNSMVDVC 
SVDRRSVLQRIMETDPIiQQGQAIiASALKNKKKMQKRADKIASKIj 
SDSMMSALDLSGNADDGVGD 


5908 


247 


975 


HCGIKKRGEGSGSPSPASGGFQLGCQIP3PSLPSEEETHPHTRA 
HTRTLRATDTRRPPRSHSTRbRFPMPIiDGDGGLASWK/PMRER* 
GWRR PAKAAGAS LGVAATG KRGCRM5 KRYLQKATKGKLL 1 1 1 F I 
VTLWGKWSSANHHKAHHVKTGTCE VVAXjHRCCNKNKI EERS QT 
VKCSCFPGQVAGTTRAAPSCVDASIVEQKWWCHMQPCLEGEECK 
VLPDRKGWS CSSGNKVKTTRVTH 


5909 


1 


5002 


PAIPGSTI I WAPGSHSAARADGRHGSliPSQSQAPGAIiCGARAPP ' 

S SNLRADR SMI CAQARAGKNLYHNR FLGIiAAMAFPSRNSQS LRR 

CKEPIRYSYNPDQFHNMDLRGGPHDGVTIPRSTSDTDLVTSDSR 

STLMGRSSYYSIGHSQDLVIHWDIKEEVDAGDWIGMYLIDEVLS 

ENFLDYKNRGVNGSHRGQIIWKIDASSYFVEPETKICFKYYHGV 

SGALRATTP S VTVKNSAAP I FKS I GADET VQGQGS RRIj I S FS LS 

DFQAMGLKKGMFFNPDPYLKIS IQPGKHS I FPALPHHGQERRSK 

IIGNTVNPIWQAEQFSFVSLPTDVLEIEVKDKFAKSRPIIKRFL 

GKLSMPVQRLLERHAIGDR WS YTLGRRLPTDHVSGQLQFRFE I 

TSSIHPDDEEISLSTEPESAQIQDSPMNNLMESGSGEPRSEAPE 

SSESWKPEQLGEGSVPDRPGNQSIELSRPAEEAAVITEAGDQGM 

VSVGPEGAGELLAQVQKDIQPAPSAEEIiAEQLDLGEEASALIiLE 

DGEAPASTKEEPLEEEATTQSRAGREEBEKEQEEEGDVSTLEQG 

EGRLQLRASVKRKSRPCSLPVSELETVIASACGDPETPRTHYIR 

I HT LLHS M PS AQGGS AAEEEDGAE EESTLKDS S EKDGLS E VDTV 

AADPSALEEDREEPEGATPGTAHPGHSGGHFPSLANGAAQDGDT 

HPSTGSESDSSPRQGGDHSCEGCDASCCSPSCYSSSCYSTSCYS 

SSCYSASCYSPSCYNGNRFASHTRFSSVDSAKISESTVFSSQDD 

EEEENSAFESVPDSMQSPELDPESTNGAGPWQDELAAPSGHVER 

SPEGLESPVAGPSNRREGECPILHNSQPVSOLPSLRPEHHHYPT 

IDEPLPPNWEARIDSHGRVFYVDHVNRTTTWQRPTAAATPDGMR 

RSGSIQQMEQLNRRYQNIQRTIATBRSEBDSGSQSCEQAPAGGG 


5910






GGGGSDSEAESSQSSLDLRREGSLSPVNSQKITLLI^SPAVKFIl 
TNPEFFTVLHANYSAYRVFTSSTCLKHMILKVRRDARNFERYQH 
NRDLVNF I NM FADTRLELPRG WE I KTDQQG KS FFVDHNS RATT F 

IDPRI PLQNGRLPNHLTHRQHLQRLRS YSAGEASEVSRNRGASL 
IAR PGHS L VAA I RSQHQHES LPLA YNDKI VAFLRQPN I FEMLQE 
RQPSLARNHTLREKIHYIRTEGNHGLE KLSCDADLVILLS LFEE 
EIMSYVPLQAAFHPGYSFSPRCSPCSSPQNSPGLQRASARAPSP 
YRRDFEAKLRNFYRKLEAKGFGQGPGKIKL1 IRRDHLLEGTFNQ 
VMAYSRKELQRNKLYVTFVGEEGLDYSGPSREFFFLLSQEIjFNP 
YYGLFBYSANDTYTVQISPMSAFVENHLEWFRFSGRILG\LALI 
HQYIiLDAFFT\RPFYKALL\RLPC\D\LSDIiEYLDEEFHQSLQW 
MKDNWITDII*DLTFTVNEEVFGQVTERELiKSGGANTQVTEKNKK 
EYIERMVKWRVERGWQQTEAIiVRGFYEWDSRLVSVFDARELE 
IiVIAGTAEIDLNDWRNNTEYRGGYHDGHLVIRWFWAAVERFNNE 
QRLRLLQFVTGTSSVPYEGFAAPPWEPMGLRRFLP*KKWGKITS 
LPPRG\HTGLQPDWDLPTVSPRTPMLYEK\LLTA\VRRT«Ttrr:'P 1 


5911 


1526 


446 1 VAE PAAMEPGRTQ I KLDPR YTADLLB VXjKTN YG I P S ACFSQP °T 
AAQLLRALQPVELALTSIIjTIiIjALGS iai fledavylykntlcp 
IKRRTLLWKSSAPTWSVLCCFGLWIPRSLVLVEMTITSPYAVC 
FYLLMLVMVEGPGGKEAVIjRTIjRDTPMMVHTGPCCCCCPCCPRL 
LLTRKKLQ \ R* C WALSNTPS * R * R * PWWACFSSPTASMTQQTFL 
RGAQIiYGSTLSSA/ CSTLLALWTLGI ISRQARLHLGEQNMGAKF 
ALFQVLL ILTALQPS I FSVLANGGQ IACS PPYSSKTR3QVMNCH 

LLILETFLMTVLTRMYYRRKDJIKVGYETFSSPDLDLNIjKALRWM 
AWTMKGCCTH 


5912 


109 


595 QaPIAPCIQGKGIiBMRSPKPQSFIIRSSHSGAGIjIiVKWPSTPVF 1 
OGHRRGGAAFKYKPTPWGPEQRPTGQKHMRGGVSLLSPRLECS 
GT I S AHCNLRL PS S SNS PAPAS * LAG I TG VCHHAQL 1 F VFL VE T 
GFHHVGQAGLELL/NWIHIiPRPPKVLGLQA | 




924 


2 77 M I L»N KALM LGALAL TT VMS P CGGED I VADH VAS YG VNX» YQSYGP 1 
SGQ YSHE FDGDEE P YVDLER KET VWQL P LFRRFRR FD PQFAI/TN 
I AVLKHNLN I VI KRSNS TAATNE VPEVTVFS KSP VTLGQ PNTL I 
CLVDNIFPPVVNITWLSNGHSVTEGVSETRPSSPKSDHFrjjQDQ 
VTSPS FP FE* *DL*TAKVEQLGAWFEPLLKHWGAE IPTTIi 1 


5913 
5914 


46 


1198 OljRMAGAEGAAGRQSELE^PVVSLVDVliEEDEEIiENEACAVLGGS 
DS E KCS YS QG S VKRQAL YACS TCTPEGEE PAG ICLACS YECHGS 
HKLFELYTKRNFRCDCGNSKFKNLECKLLPDKAKVNSGNKYNDN 
FFGLYCI CKRPYPDPEDEI PDEMIQCWCEDWFHGRHLGAIPPE 
SGDFQEKVCQACMKRCSFLWAYAAQLAVTKIST\GMMDWCGTLM 
B * /DDQEVI KPENGEHQDSTLKEDVPEQGKDDVREVKVEQNSEP 
CAGSSSESDLQTVFKNESLNAESKSGCKLQELKAKQLIKKDTAT 
YWPLNWRSKLCTCQDCMKMYGDLDVLFLTDEYDTVLAYENKGKI 
AQATDRSDPLMDTLSSMNRVQQVEL I C/G IQ * FED 


5915 


960 


124 iNUiCiSEliPPEEALFIQVASMNQRRVDFYLASIEDMIiVAl/GGRN 
EWGALSSVETYSPKTDSWSYVAGLPRFTYGHAGTIYKDFVYISG 
GHD YQ I G P YRKNLLC YDHRTD VWB ERR PMTTARGWHSM CSLGDS 
IYSIGGSDDNIESMERFDVLGVEAYSPQCNQWTRVAPLLHANSE 
SG VAVWEGR I YI LGG YS WENTTAFSKTVQ V YDREADKWSRGVDLP 

KAIAGGSACFIAP*SLGQRTRKRKAKARGTRTGASDPSCASWDH 
| PHRHLPGLCRPAATS 






703 


1 FPGRPTRPLKLGRRRKRARI IQAPHCHSPRPRTCPPGALQAPEA 
PASRAEGPVAVWNGHTEG PAPARS APKE P PGL P RPLG S FP CPT 
PQEDFPALGGPCPPRMPPSPGFSAWLLKGTPPPPPPGLVPPIS 
KPPPGFSGLLPSPHP\PVSPAPPPPPPQK/RPRLLPAP/PGLPS 
PRELPGEEPSAHPVHQGLPAERRGPLQRVQEPLRGVQTGPDLRS 
PVLQELPGPAGGEFPEGL* * AAGPAAH 


5916
5917


256 
1343 


*33 

827 ~j 


SPRMWEIWGPWHRWESFSLEGEWPSRIPEPSPD3TK3T5GKGCR~ 
TVTG A VH RHLNHV AG 1 1 PWVLHSQLKPTAATAQDQWTSQQYPDH 
PTRLI LQ * NQATADKNN * TTALLQPHQRL \ VS PRMAEA 
AHQILTYLEP/ ICLWNYNKILTVFLTKSVliEl * KFIHTPQTYR — 








?*NDFFGIKEVYVSRRLRKTSF/kLAVTFLEQAWSKECVPVDQ 
?ME HLL PS LLSLAS D PVPNVR VLLAKALRQMLLE KAY FRNAGN P 
HLE V I EET I LALQS DRDQDVS F FAALE P KRRNI I DTAVL BKQN 


5918 


13 


1247 


EGAQVARRRSRRQWRAGRCGRGRGGRRAERTGGRGPPGRPRPIiP 
PGPARRGRRRMETPFYGDEALSGLGGGASGSGGTFASPGRLFPG 
A?PTAAAGSMMKKDAIiTLSLSEQVAAALKPAPAPASYPPA\ADG 
APSAAP PDGLLAS PDLGLLKIiAS PELERL I IQSNGLVTTTPTS S 
Q FLYPKVAAS EEQEFAEGFVKALEDLKKQNQLGAGRAAAAAAAA 
AGGPSGTATGSAPPGELAPAAAAPEAPVYA\NIjSSY\AGGCRGL 
RGG AAT \ VAF AAE P VP F PPPP P PGALG PRR P / RLALQGRR PQT V 

pdvp\sfgesp\plspiet\dtprri\kakrkrl\rnpqirapk 

PASRKL»GAQSRA1iERESEDPS * SPEHGSLASrASLLREQVAQLK 
QKVLSH VNS G CQLLPQHQVPAY 


5919 


1 


4254 


tsvqgdsqgtptssqgsinmehwisqaihgsttsttsssstqsg 
gsgaahriadvmaqthi enhsappdvttytsehsiqverpqgst 
gsrtapkygnaelmetgdgvpvssrvsakiqqlvntlkrpkrpp 
lreffvddfeellevqqpdpnqpkpegaqmlamrgeqlgwtnw 

PPSLEAALQRWGTISPKAPCXiTTMDTNGKPLYILTYGKLWTRSM 

kvays ilh klgtkqepmvrpgdrvaii v fpnnd paafmaafyg cl 
laewpvpievpltrkdagsqqigfllgscgvtvaltsdachkg 
lpkspix3e i pqfkgwpkllwfvtes khls kp prdwf\ phi kdan 

NDTAYIEYKTCK\DGSVLGVTVTRTALLTHCQALTC2ACGYTEAE 
TI VNVIiDFKKDVGL WHG I LTSVMNMMHV I S IPYSLMKVNPLSWI 
QKVCQ YKAXVACVXS RD MHWALVAHRDQRD I NLS S LRM L I VADG 
ANPWSISSCDAFLNVFQSKGLRQEVICPCASSPEALTVAIRRPT 
DDSNQ PPGRGVLS MHGLT YGVI RVDS EE KLS VLT VQDVGLVMPG 
AlMCSVKPDGVPQLCRTDEIGELCVCAVATGTSYYGIiSGMTKNT 
FE V FAMTS SGAP I SE Y P F I RTGLLG FVG PGGLVF VVGKMDGLMV 
VSGRRHNADD I VATALA VEPMKF VYRGRIAVFS VTVLHD E R I VI 
VAEQR? DSTE EDS FQ WMS RVLQAI DS I HQVGVYCLALVPANTLP 
KTPLGGIHLSETKQLFLEGSLHPCNVLMCPHTCVTNLPKPRQKQ 
PEIGPAS VMVGNLVSGKR I AQASGRDLGQ IBDNDQARKFLFLSE 
VLQ WRAQTTPDHI L YTLLNCRGA IANSLTCVQLHKRABK I AVML 
MBRGHLQDGDHVALVYPPGIDLIAAFYGCLYAGCVPITVRPPHP 
QN I ATTLPTVKMI VEVSRS ACLMTTQL I CKLLR S R BAAAAVD VR 
T WPLXLDTDD * PKKRPAQI CKPCNPDTIAYLDFSVSTTGMLAGV 
KMSHAATSAFCRSIKLQCELYPSREVAICLDPYCGLGPVIiWCLC 
SVYSGHQSILIPPSELETNPALWLIAVSQYKVRDTFCSYSVMEL 
CTKGLGSQTESLKARGLDLSRVRTCVVVAEERPRIALTQSFSKL 
FKDLGLH PRAVSTSFGCRVNIAI CLQGTSGPD PTTVYVDMRAUR 
HDRVRLVERGS PHSIiPJLMESGKILPGVR 1 1 IANP ETKGPLGDSH 
USE IWVHSAHNAS G YFTI YGDESLQSDHFNSRLS FGDTQT I WAR 
TGYLGFLRRTELTDANGBRHDALYWGALDEAMELRGMRYHPID 
I ETS VI RAHKS VTECAVFTWTNLLVVVVELDGS EQEALDLVPLV 
T^^VVLEEHYLIVGVVVVVDIGVIPINSRGEKQRMHLRDGFLADQ 
LDPIYVAYNM 


5920 


1381 


1499 


QI^AVAHAGVSRIPP*LFPPLHPTFLSIiWCIjHHKLP/HPPGASM 
VRPP WPRRPPAH 1 SS VRQAS TQVPRTVPHTQRVANIGTQTTGP 
SGVGCCTPGRPLLPCKCS^AAHSTYRVQEPAVHIPGQEPLTASM 
IiAAAPLHEQKQMIGERLYPLIHDVHTQLAGKITGMLLElDNSEI* 
LLMLKS PESLHAKI DEAVAVLQAHQAMEQPKAYMH 


5921 
5922 


727 
2475 


157 

< 

i 

495 i 


VCPG IGGE *GLWGQLGGI>PKETPLKPMDAFTGSGLKRKFDDVI>V — 
GSSVSNSDDElSSSDSADSCDSIiNPPTTASFTPTSILKRQKQLR 
RXNVR FDQ VTVY YFARRQG FTS VPSQGOS3 LGMAQRHNS VRS YT 
LCEFAQEQEVNHREILREHLKEEKLHAKKMKLTKNGTVESVEAD 
SLTLiDD VS DEOI DVENVE VDDYF FLQPI* PTKRRRALLRASG VHR 
IDAEEKQELRAIRLSREECGCD CRLYCDPEACACSQAG I KCQVD 
RMSFPCGCSRDGCGNMAGRIEFNPIRVRTHYLHTIMKLELESKR 
3\GAAQGPQ\ *GALPDCQLQPDRSTGL* DPS WIGSKGLS FTGKG 
^AATHLI ILRVTENRGAEGKRK 

3 YSNWGLFPS VFIQ VPRSRTGNLKPXFIj FYS YYE \CMETLKGXT~ 








CL YNATQ YKVCS PRNDR PD ACYN PSE PAATTV FE IRTGLLLGDT 
S K 1 1 TRTE E KE I P KQ I TLRFDACAAI NS KKLE IGCG S LN * ERS * 
RVENKYVCHE SG VCKWCAYWPCVI *AT*KKNKNDSVYLQKGEAN 
PS CAAGH CNPLEI/I I TNPLDPH WKKGER VTLG INRTGLKPQ Wl 

LIKGEVHKCSPKPVFOTFVFPT.TJT.PaDT?T rv^PVKTrcr/tT 

I FLLNGTS C YVRGGTT IGDR WP WE A * EL VPTD PAPD 1 1 P I * KAE 
ASNF* VXiKTS I IRQYCIAREGKDFI I PVGKPNCIGQKLYNSTTK 
TIT** DLNHTE KNPFS KFS KLKTA* AHAES H * DWTV PSGL Y * I C 
RHRAYFRLPNKWADSCVTGTI KPS FFLLPI KMGELLGFS VYASR 
E KKG I VIGNW KDNE W P RERI IQ Y YG PATWAQDGS WGYR / TP / VY 
MLNW 1 1 RLQ A I LE 1 1 SNETGRALT VLAWQE TQMRNA I YQNRLALi 
DYLLVAEX3GVCRKFWLTNCCLQINDQGQVVKNIVRDMTKLAHVP 
IQVWHKFDPESLFGKWFPAIGGFKTLIVGVLLVIRTCLLLPCVL 
PLL FQM I KG I VATL VHQKTS AHVNYMNHYR SIS QRDS KS EDES E 
NSH 


5923 
5924 


137 


63o | UbQJGRRGQRFRTSiKRMHPI*RTCPNTNL/lILl7SQEWTQIR0ir- 
QGBNREL WI S IiEEHQDALEL IMS KYRKQMLQLMVAKKAVDAE PV 
I j IiKAHQSHSAEIESQIDRICEMGBVMRKAVQVDDDQPCKIQEKIiA 
QLELENKELREIiLSISSESLQARKENSMDTASQAIK 




274 


J 2146 EKGKVKDAGAEQWISIjSLSCKOSWETQPSNHLNSLTPPTSVRRM 

j p^ittvtllkmvarhhkkllcskafstqlqqkiflhsqmgihhq 

j SVCMKLKPNTSH1 ISILMGQPMALVQIiETLiAPIiTII IQKFQTQD 

HMKFWKNLPLHSHHLTPSVPQTVIPKKTGSPBIKLKITKTIQNG 

RELFESSLCX3DLLNEVQASE\Q*NQSIESRKEKRKKSNKKDSSR 

SEERKSHKIPKLEPEEQNRPNERVDTVSEKPREEPVLKEGSPSS 
1 ANTIFCSNNGSVHW\FKFOVGnijVWQlfVf5TVut(JCJDr , iun7C5cnTi/\T 

EVHTKINTRGAREYHVQFFSNQPERAWVHEKRVREYKGHKQYEE 
LLAEATKQASNHSEKQKIRKPRPQRERAQWDIGIAHAEKALKMT 
REERlEQYTFIYIDKQPEEALSQAKKSVASKTEVKKTRRPRSVIi 
NTQPEQTNAGEVASSIiSSTEIRRHSQRRHTSAEEEEPPPVKIAW 
KTAAARKS 1.PAS I TMHKGS LDLQKCNMS P WKI EQVFALQNATG 
DGKFI DQ FVYSTKG I GNKTE I S VRGQDRL I I ST PNQRNEKPTQS 
VSSPEATSGSTGS VEKKQQRRS IRTRSESEKSTEWPKKKIKKE 
QVGFLHVES 


5925
5926 


216 


1911 MMTAESREATGLS PQAAQEKDGI VI VKVEEEDEEDHMWGQDSTIi 
QDTPPPDPE IFRQRFRRFCYQNTFGPREALSRLKELCHQWLRPE 
INTKEQlLEDLVLEQFIiSILPKELQVWLQEYRPDSGEEAVTLLE 
DLEIiDLSGQQVPGQVHGPEMLARGMVPLDPVQESSSFDLHHEAT 
QSHFKHSSRKPRLLQSRALPAAHIPAPPHEGSPRDQAMASALFT 
ADSQAMVKI 3DMAVSLI LEEWGCQNLARRNLSRDNRQENYGSAF 
PQGGENRNENEESTS KAETSEDSASRGETTGRSQKEFGEKRDQE 
GKTGERQQKNPEEKTRKEKRDSGPAIGKDKKTITGERGPREKGK 
GLGRS FS LS SNFTT P E E VPTGTKSHRCDE CGKCFTRS S S L IRHK 
I IHTGEKP YECSECGKAF\SIiNS \NLVLHQRI \ HTGE KPHECNE 
CGKAFSHSSNbILHQRIHSGEKPYECNECGKAFSGSSD\LTKHQ 
R IHTGEKP YECSECGKAFNRNS YLI LHRR VHTREKP YKCTKCGK 
j \AFTRSSTLTLHHRIHARERASEYSPASLDAFGAFLKSCV 


5927 


2 


233 1 UKCLMLKQGSQPGSPPAr/CEPPAPPVYQAPCQSCPEPPGAHEP 
SDSPHHTPVHPPPEHSAACPAPATCCPPPRSSMS 




414£ 


1248 ^fskfgsqalyqlkrpasgqnsisvmpaqkitkpaakygTpla~" 

YKKYGDKKIiHEKKPLQKHKQAHQTPEKRVNTGEERR KI 5EEAAR 
KRRLEFIEKEKKQKDQI I SLMKAEQMKRQEKERLERINRAREQG 

wrnvlsaggsgevkapflgsggtiapssfssrgqyehyhaifdq 

MQQQRAEDNEAK WKRE I YGRGL PERQ KGQLAVERAKQVEE FLQR 

k^eamqnkaraeghkgilqniaamyggrpsssrggkprnkeeev 

YLARLRQIRLQNFNERQQIKAKIiRGEKKEANHSEGQEGSBEADM 
RRKK\ IESLKAHANARAAVT,KEOLERKRKEAYBREKKVWEEHLV 
j AKGVKSSDVS P PIXJQHETGGSPS KQQMRS VISVTSALKB VG VDS 

sltdtretseemqktnnaisskreilrrlnenlkaqedekgkqn 
lsdtfeinvhedakehekeksvssdrkkweaggqlvipldeltl 

j DTS FSTTERHTVGE VIKXGPNGS PRRAWGKS PTDSVI.K ILGEAE 



5928 



4146 






1248 



5929 



5930



113 



1558 



6082 




LQLQTSLLENTT1RSEJ5PEGSKYKPLITGEKKVQ CISHEINPS " 
A I VDS P VE7KS P E FS E AS PQMS LKL EGNLEE PDDL E TE I LQE PS 

GTNKDE\SLPCTXTDVWrSEEKETKBTQSADRITIQENEVSEDG 
VSSTVDQLSDIHIEPGTNDSQHSKCDVDKSVQPEPFFHKyVHSE 
HLNLVPQVQSVQCSPEESFAFRSHSHLPPKNKNKNSLLIGIiSTG 
LFDANNPKMLRTCSLPDLSKLFRTLMDVPTVGDVRQDNLEIDEI 
EDEN I KEG PS DS ED I VFEETDTDLQE LQASMEQLLRBQ PG EE Y S 
EEEESVLKNSDVEPTANGTDVADEDDNPSSESALNEEWHSDNSD 
GE I AS ECE CDS VFNHLEE LRLKLEQEMG FEKFFE VYE K I KAIHE 
DEDENIEICSKIVCNILGNEHQHLYAKILHL VMADGAYQEDNDE 
KHFSKFGSQALYQLKRPASGQNSISVM PAQKITKPAAKYGIPXjr 
YKKYGDKKLHEKKPLQKHKQAHQTPEKRVNTGEERRKISEEAAR 
KRRLEFIEKEKKQKDQIISLMKAEQMKRQEKERLERINRAREQG 
WRNVLSAGGSGEVKAPFLGSGGTIAPSSFSSRGQYEHYHAIFDQ 
MQQQRAEDNE AK WKR E I YGRGL PE R QKGQLAVERAKQVE E FLQR 
KREAMQNKARAEGHMGILQNLAAMYGGRPSSSRGGKPRNKEEEV 
YLARLRQIRLQNFNERQQIKAKLRGEKKEANHSEGQEGSEEADM 
RRKK\ IESLKAHANARAAVLKEQLERKRKEAYEREKKVWEEHLV 
AKGVKSSDVSPPLGQHETGGSPSKQQMRSVISVTSALKEVGVDS 
SLTDTRETSEEMQKTNNAI SSKREI LRRLNENLKAQEDEKGKQN 
LSDTFEINVHEDAKEHEKEKSVSSDRKKWEAGGQIiVIPLDELTL 
DTSFSTTERHTVGEVIKLGPNGSPRRAWGKSPTDSVIiKILGEAB 
LQLQTELLENTT I RSE I S PEGEKYKPL I TGE K K VQCI S HE I NPS 

AIVDSPVETKSPEFSEASPQMSIjKLEGNLEEPDDLETEILQEPS 
GTNKDE\SLPCTITDVW1SEEKETKETQSADRITIQENEVSEDG 
VSSTVDQLSDIHIEPGTNDSQHSKCDVDKSVQPEPFFHKWHSE 

hlnlvpqvqsvqcspeesfafrshshlppknknknsu.iglstg 

L FDANNP KMLRTCS LPDLS KLFRTLMDVPTVGD VRQDNLE I DE I 
EDENIKEGPSDSEDIVFEETDTDLQELQASMEQLLREQPGEEYS 
EEEESVLKNSDVEPTANGTDVADEDDNPSSESALNEEWHSDNSD 

geiasececdsvfnhleelrlhleqemgfekffevyekikaihe 

D3DENIEICS K I VQNT LGNEHQHt* YAK I LHL VMADGA YQEDNDE 




LDFSMTTQLPA Y VA I LLF Y VS RAS CQDT FTAA V ¥ EHAAI LPNAT 
LTPVSREEALALMNRNLDILEGAITSAADQGAHI IVTPEDAIYG 
WNFNRDSLYPYLEDIPDPEVNWIPCNNRNRFGQTPVQERLSCLX 
AKNNSIYWANIGDKKPCDTSDPQCPPDGRYQYNTDWFNDSQG 
KLVARYHKQNLFMGENQFNVP KE PE I VT FNTTFGS FG I FTCFD I 
LFHDPAVTLVKD FHVDTI VFPTAWMNVLPHLSAVEFHS AWAMGM 
RVNFLASN IHYPS KKMTGSG I YAPNSSRAFH YDMKTEEGKLLLS 
QLDSHPSHSA WNWTS YASS I EALSSGNKE FKGTVFFDEFTFVK 
LTG VAGNYTVCQ KDLCCHLS YKMS EN I PNE VYALGAFDGLHTVE 
GR YYLQ I CTLLKCKTTNLNTCGDS AETASTR FEMFS IjSGTFGTQ 

YVFPEVLLSENQLAPGEFQVSTDGRLFSIiKPTSGPVLTVTLFGR 
LYE KDWASNAS SGL^AQAR I IMLI VIAPI VCSLSW 




RGNCFWIVPFTMAQRTGLEDPERYL FVDRAVIYNPATQADWTAK * ~ 
KLVWIPSERHGFEAAS IKEERGDEVMVBLAENGKKAMVNKDDIQ 
KiWPPKFS JCVEDMAELTCLNEAS VLHNLKDRYYSGL I YTYSGLF 
CWINPYKNLPIYSEWIIEMYRGKKRHEMPPHIYAISBSAYRCM 
LQDREDQS I LCTGESGAGKTENTKKVIQ YLAH VASSH KGR KDHN 
I PG E \ LE RQLLQANP I L ES FGNARTVQNDNS S RFGKF I R INFD V 
TOY I VGAN I E T YLLEKS RAVRQAKDERTFHI F YQLLS G \ AGEHL 
KSDLLLEGFNNYRFLSNGYIPIPGQ\QDKGNFRGDPGEAf/JHIMG 
FSHEEILSMLKWSSVLQFGNISFKKERNTDQASMPBNTVAQKL 
CHLLGMNVMEFTRA I LT PR I KVGRD YVQ KAQTKEQAD FAVEALA 
KAT YERL FR WLVHR I NKALDRTKRQGAS FI G I LDI AG FE I FE LN 
SFEQLCINYTNEKLQQLFNHTMFILEQEEYQREGIEWNFIDFGL 
DLQP CI DL I ERPANP PGVLALLDE E CWFP KATDKTFVEKL VQEO 

GSHSKFQKPRQLKDKADFCIIHYAGKVDYKADEWLMKNMDPLND 
NVATLLHQS SDR F VAELWKDVDR I VGLDQVTGMTETAFG S A Y KT 
KKGM FRTVGQL YKES LTKLMATL RNTNPN FVRC 1 1 PNHEKRAGK 
LDPHLVIiDQLRCNGVLEGIRICRQGFPNRIVFQEFRQRYEILTP 



5931



113 



MAIPKGFMDGKQACERMIR ALljtDPI^YRI^SKIFFRAG VI^r 
LKKKRDLKITDIIIFFQAVCRGYLARKAFAKKQQQLSADKVLQR 
NCAAYLKLRHWQWWRVFTKVKPLLQVTRQEBELQAKDEELLKVK 
EKQTKVEGELEEMERKHQQLLEEKNILAEQLQAETBLFAEAEEM 
RARLAAKKQELEEILHDLBSRVEEEEERNQILQNEKKKMQAHIQ 
DLEEQLDEE EGARQ KLQLEKVTAEAKI KKWEEE ILL LE D QNS KF 
I KE KKLMEDR I AE CS S QLAEE E E KAKNLAK1 RNKQEVM I S DL EE 
RLKKEEKTROEIiRPfaifPK-T^rMr'C'wnr.nriATi.BT 




6082 



EKASRNKAEKQKRDLSEELEALKTELEDTLDTTAAQQELRTKRE 
QEVAELKKALEEETKNHEAQIQnMRQRHATALEELSEQLEQAKR 
FKANLEKNKQGLETDNKELACEVKVLQQVKAESEHKRKKLDAQV 
QELHAKVSEGDRLRVELAEKASKLQNEUDNVSTLLEEAEKKGIK 
FAKDAAS LES QLQDTQELLQEETRQKLNLS S R I RQLEE E KNSLQ 
EQQEEEEEARKNLEKQVLALQSQLADTKKKVDDDLGTIESLEEA 
KKKLLKDAEALSQRLEBKALAYDKLEKTKNRLQQELDDLTVDLD 
HQRQVASNLBKKQ\KKFDQLLAEEKSISARYAEERDRAEAEARE 
KETKALSLARALEEALEAKEEFERQNKQLRADMEDLMSSiQODVG 
KNVHEI*EKSKRALEQQV\EEMRTQLEELEDELQATEDAKLRLEV 
NMQAMKAQFERDLQTRDEQNEEKKRLLIKQVRELEAELEDERKQ 
RALAVASKKKMEIDLKDLEAQIEAANKARDEVIKQLRKLQAQMK 
DYQRELEEARASRDE I FAQS KESEKKLKSLEAE I LQLQEELAS S 
E RARRHAEQERDELA0E I TNS ASGKS ALLDE KRRLEARI AQLEE 
ELEEEQSNMELLNDRFRKTTLQVDTLNAELAAERSAAQKSDNAR 
QQLERQNKELKAKLQELEGAVKSKFKATISALEAKIC3QLEEQLE 
QEAKERAAANKLVRRTEKKLKEIFMQVEDERRHADQYKEQMEKA 
NARMKQLKRQLBEAEEEATRANASRRKLQRELDDATEANEGLSR 

EVSTLKNRLRRGGPISFSSSRSGRRQLHLEGASLELSDDDTESK 
TSDVNETQPPQSE 

KWM Utf W I VPFTMAQRTGLEDPER YLF VDRAV1 YN PATQAD^TAK 
KLVW IPS ERHGFBAAS I KEERGDEVMVELAENGKKAMVNKDD I Q 
KMNPPKFSKVEDMAELTCLNEASVLKNLKDRYYSGLIYTYSGLF 
CVVXNPYKNLPIYSENIIEMYRGKKRHEMPPHIYAISESAYRCM 
LQDREIX5SILCTGESGAGKTENTKKVIQYLAHVASSHXGRKDHN 
IPGE\LERQLIjQANP I LES FGNARTVQNDNS SRFGKFIRINFDV 

TGYIVGANIETYLLEKSRAVRQAKDERTFHIFYQLLSG\AGEHL 

ksdlllegfnnyrflsngyipipgqXqdkgnfrgdpgeamhimg 

FSHEEILSMLKVVSSVLQFGNISFKKERNTDQASMPEWTVAQKL 

chli^nvmeftrailtprikvgrdyvqkaqtkeqadfaveala 

KAT YERLFRWLVHR INKALDRTKRQGAS FIGILD I AGFE I FELN 
SFEQLCINYTNEKLQQLFNHTMFILEQEEYQREGIEWNFIDFGL 
DLQPCIDLIERPANPPGVLALLDEECWFPKATDKTFVEKLVQEQ 
! G SHSKFQKPRQLKDKADFCI IHYAGKVDYKADSWIiMKNMDPLND 
NVATLLHQS SDRFVAE LWKDVDR IVGLDQVTGMTETAFGS AYKT 
KKGM FRTVG QL YKES LTKLMATLRNTNPNFVRC 1 1 PNHEKRAG K 
LDPHL VLDQLRCNGVLEG IR I CRQG FPNR I VFQEFRQR YE ILTP 
I NAI PKGFMDGKQACERM I RALELDPNLYR IGQSKI FFRAGVLAH 
! LEEERDLKITDI II FFQAVCRGYLARKAFAKKQQQLSALKVLQR 
NCAA YLFQjRHWQWWR VFTKVKPLLQ VTRQE E ELQAKDEELL KVK 
EKQTKVEGELEEMERKIIQQLLEBKNILAEQLQAETELFAEAEEM 
RARLAAIOCQBLEEILHDLESRVEEEEERNOIIiQNEKKKMQAHlQ 
I ^kEEQLDEEEGARQKLQLEKVTAEAXIKKMEEEILLLEDQNSKF 
, XKBKKLKEDRIAECSSQLAEEEEKAXNLAKIRNKQEVMISDLEE 
1 RLKKEEKTRQELEKAKRKLDGETTDLQDQIAELQAQIDELKLQL 
AKKEEEIjG^ALAJ^GDDBTIjHKNNALKVVREIiQAQIAELQEDFES 

I eka s rnkae kqkrdlsbelealkteledtldttaaqqelrtkre 
qevaelkkaleeetknheaqiqdmrqrhataleelseqleqakr 
! fkanleknkqgletdnkelacevkvlqqvkaesehkrkkldaqv 
I Qelhaxvsegdrlrvelaekasklqneldnvstlleeaekkgik 
fakdaaslesqlqdtqellqeetrqklnlssrirqleeeknslq 

ECK3BEEEBA^KNLEKQVIALQSQIJuOTKKKV^ 








KKKLLKDAEALSQRLEEKALAYDKLEKTKNRLQQELDDLTVDLD 
HQRQVASNLEKKQ\KKFDQLLAEEKSISARYAEERDRAEAEARE 
KETKALSLARAIiEBALEAKEEFERQNKQLRADMEDLMSSKDDVG 
KNVHEL£KSKRALEQQV\EEMRTQLEELEDELQATEDAKLRLEV 
NMQAMKAQFERDLQTRDEQNEEKKRLL I KQVRBLBAELEDERKQ 
RAIAVASKKKMEIDLKDLEAQIEAANKARDEVIKQLRKLQAQMK 
DYQRELEEARASRDE I FAQSKESEKKLKSLEABILQLQE3LASS 
E RARRHAEQ E RDELADE I TNS AS G KS ALLDEKRRLEAR I AQLBE 
E L EEEQS NME LLNDRFRKTTLQVDTLNAELAAE RS AAQ KSDNAR 
QQLERQNKEL KAKLQEL EG AVKS KFKAT I S ALE AKI GQLEEQLE 
QEAKERAAANKLVRRTEKKLKEI FMQVEDERRHADQYKEQMEKA 
KARMKQLKRQLE E A EEEATRANASRRKLQ RE LDDATE ANEGLS R 
EVSTLKNRLRRGGFISFSSSRSGRRQLHLEGASLELSDDDTESK 
TSDVNETQPPQSE 


5932
5933


33 


572 


RHLEE I C FLFLQKGRKLKLSGPR WEEG KP RGTGGLW VKAEANMG 
FGATLAVGLTI FVLS WTI I ICFTCSCCCLYKTCRRPRP V\APP 
PHPP/PVVHAPYPQPPSVPPSYPGPSYQGYHTMPPQPGMPAAPY 
PMQYP PP YPAQPMGP PAYHETLAGGAAAPYPASQPP YNPAYMDA 
PKAAL 




1 


3190 


GTRKLKMADKTPGGSQKASSKTRS^DVHSSGSSDAHMDASdPSD ' 
SDMPSRTRPKSPRKHNYRNESARESLCDSPHQNLSRPLLENKLK 
AFSIGKMSTAKRTLSKKEQEELKKKEDEKAAAEIYEEFLAAFEG 
SDGNKVKTFVRGGWNAAKEEKETDEKRGKIYKPSSRFADQKNP 
PNQSSNERPPSLLVIETKKPPLKKGEKEKKKSNLELFKEELKQI 
QEERDERHKTKGRLSRFEPPQSDSDGQRRSMDAPSRRNRSSGVL 
DDYAPGSHDVGDPSTT\NFYLGNI\NPQMNXiKKCCCQEFGRFGP 
LASVKIMWPRTDEERARERNCGFVAFMNRRDAERALKNLNGKMI 
MSFEMKIiGWGKAVPIPPHPIYIPPSMMEHTLPPPPSGLPFNAQP 
RERLKNPNAPMLPPPKNKEDFEKTLSQAI VKVVI PTERNLLALI 
HRM I E FWREGPMFEAMX MNRE I NNPMFRFL FENQTPAHVYYRW 
KLYS I LQGDS PTKWRTBDFRMFKNGS FWRPPPLNPYLHGMSEEQ 
ETEAFVEEPSKKGALKBEQRDKLEE ILRGLTPRKNDIGDAMVFC 
LNNAEAAEEI VDCI TESLS ILKTPLPKKIARLYLVSDVLYNSSA 
KVANAS YYRKFFETKLCQ I FSDLNATYRTIQGHLQ5ENFKQRVM 
TCFRAWED WA I Y PE PFXj I KL QNI FLGL VNI I EEKETKD VPDDLD 
GAPI EEELDGAPLEDVDG I PIDATPI DDLDGVP I KSLDDDLDGV 
PLDATEDSKKNEPX FKVAPS KWEAVDE S ELEAQAVTTSKWE L FD 
QHEESEEEENQNQEEESEDEEDTQSSKSEEHKLYSNPIKBEMTE 
S KFS K YS E MS EE KRAKLRE I ELKVMKFQDELESGKR PKKPGQS F 
QEQVBHYRDKLLQREKEKELERERERDKKDKEKLE5RSKDKKEK 
D ECTPTRKERKRRHSTS PS PS RSSS G RRVKS PS P KS E RS ERS ER 
S HKE S S RS RS SHKDS PRDVS KKAKRS PSGSRT PKRS RRSRS RS P 
KKSGiOCSRSQSRSPHRSHKKSKGKTNTGRKFFKKAVTYWKCDLF 
LCPERSVF 


5934 


1 


3190 


GTRKLKMADKTPGGSQKASSKTRSSDVHSSGSSDAHMDASGPSD 
SDMPSRTRPKSPRKHNYRNESARESLCDSPHQNLSRPLLENKLK 
AFSIGKMSTAKRTLSKKEQEELKKKBDEKAAAEIYEEFLAAFEG 
SDGNKVKTFVRGGWNAAKEEHETDEKRGKIYKPSSRFADQKNP 
PNQSSNERPPSLLVIETKKPPLKKGEKEKKKSNLELFKEELKQI 
QEERDERHKTKGRLS R FE P PQS DSDGQRRS MDAPS RRNRSSG VL 
!JUxAPiai>iiUvC3DPSTr\NFYLGNI \NPQMNLKKCCCQEFGRFGP 
IASVKIMWPRTDEERARERNCGFVAFMNRRDAERALKNLNGKMI 
MS EEMKLGWGKAVP I PPHP I YI PPSMMEHTLPP PPSGIjPFNAQP 
RERLKNPNAPriLPPPKmCEDFEKTLSQAIVKWIPTERNIiLALI 
HRMIEFWREGPMFEAMIMNREINNPMFRFLFENQTPAHVYYRW 
KLYS ILQGDS PTKWRTEDFRMFKNGSFWRP PPLNPYIiHGMSEEQ 
ETEAFVEEPSKKGALKEEQRDKLEEILRGLTPRKNDIGDAMVFC 
LNNAEAAEE X VDCI TESLS I LKTPLP KK I ARLYLVSDVL YNSSA 
KVANAS Y YRKFFETKLCQ I FS DLNAT YRT I QGHLQS ENFKQR VM 
TCFRAWEDWAI YPEPFLIKLQNI FLGLVNI IEEKBTEDVPDDLD 
3API EEELDGAPLEDVDGX P IDATPIDDLDGVPIKSLDDDLDGV 








PLDATEDSKKNEP I FKVAPS KWEAVDE S ELE AQAVTTS KWELFD 
QHEES E EE ENQ NQE E E S EDEEDTQS S KS EEHHLYSN P I KEEMT E 
S KFS KY S EMS E E KRAKLRE I ELKVMKFQDELESGKRPKKPGQS F 
QEQVEHYRDKLLQREKEKELERERERDKKDKEKLESRSKDKKEK 
DECTPTRKERKRRHSTSPSPS RSSSGRRVKSPS PKS ERS ERSER 
SHKESSR5R5SHKDSPRDVSKKAKRSPSGSRTPKRSRRSRSRSP 
KKSGKKSRSQSRSPHRSHKKSKGKTNTGRKFFKKAVTYWKCDLF 
LCPERSVF 


5935 


3 


4493 



SYWLSGWRLSRPPRQFWAGWRGIGRFGTMAPVHGDDCEIGASAL 
■ SDS GS FVS SRARRE KKS KKGRQEALE R LKKAKAGER YK YE VED F 
TGV YEEVDEEQ YSKLVQARQDDDW IVDDDGIGYVEDGRE I FDDD 
LEDDALDADE KGKDG KARNKD KRNVKKLAVTKPNN I KS M F I ACA 
GKKTADKAVDLS KDGLLGDI LQDLNTETPQITPPPVM ILKKKRS 
IGAS PNPFS VHTATAVPSGKIASPVSRKE PPLTPVPLKRAEFAG 
DDVQVES TEE EQESGAMEFEDGDFDE PME VEEVDLE PMAAKAWO 
KESEPAEEVKQSADSGKGTVSYLGSFLPDVSCWDIDQEGDSSFS 
VQEVQVDS SHLPLVKGADE EQVFHF YWLDAYEDQ YNQ PGWFLF 
GKVWIESAETHVSCCVMVKNrERTIiYFIiPREMKIDLNTGKETGT 
EISMKDVYEEFDEKIATKYKIMKFKSKPVEKNYAFEIPDVPEKS 
EYLEVKYSAEMPOLPQDLKGETFSHVFGTNTSSIjELFLMWRKIK 
GP CWXjE VKKS TALNQP VS WCKVEAMAL KPDLVNV I KDVS PP P LV 
VMAFSMKTMQNAKNHQNE I IAMAALVHHSFALDKAAPKPPFQSH 
FCWSKPKDC I FP YAFKE VIEKKNVKVEVAATERTLLGFFLAKV 
HKI DPDI I VGHNI YGFELEVLLQRINVCKAPHWSKIGRLKRSNM 
PKLGGRSGFGERNATCGRM I CDVE I SAKEL1RCKS YHLSELVQQ 
ILKTERWIPMENIQNMYSESSQLLYLLEHTWKDA\KFILQIMC 
ELNVLPLALQI TN I AGNI MSRTLMGGRSERNE FLLLHAF YENN Y 
I VPD KQ I FRKPQQ KLGDEDEE XDGDTNKYKKGRKKGAYAGG LVL 
DPKVGFYDKFI I*LLDFNSLYPSI IQEFNICFTTVQRVASEAQKV 
TEDGEQEQIPELPDPSLEMGICPREIRKIiVERRKQVKQLMKQQD 
LNPDLIIiQYD I RQKALKLTANSMYGCLGFS YSRFYAKPLAALVT 
YKGRE I LMHTKEMVQKMNIiE VI YGDTDS IMINTNSTNL.EBVFKL 
GNKVKSE VNKL YKLLE I DIDGVFKSLLI*LKKKKYAALWEPTSD 
GNYVTKQELKGLDIVRRDWCDLAKDTGNFVIGQILSDQSRDTIV 
ENIQKRLISIGENVLNGSVPVSQFEINKALTKDPQDYPDKKSLP 
HVHVAIjWINSQGGRKVKAGDTVSYVICQDGSNLTASQRAYAPEQ 
LQKQDNIiTIDTQ YYIiAQQI HPWAR I CEP I DG I DAVL I ATGWE L 
\DPTQFKVHHYHKDEENDALU3GPAQLTDEEKYRDCERFKCPCP 
TCGTENI YDNVFDGSGTDME PSLYRCSNI DCKAS PLTFTVQLSN 
KL XMD I RR FI KKY YDGWL I CEEPTCRNRTRHLPLQFSRTG P I#CP 
ACMKATLQP E YS D KS L YTQLCF YRY I FDAECALE KLTTDHE KDK 
LKKQFFTPKVLQDYRKLKNTAEQFLSRSGYSEVNLSKLFAGCAV 
KS 


5936


1124 


139 


RGEEQFDAEFRRFACLGFGERLQEFSRIiLRAVHRSRAWTCYLAI 
RMLMATCCPSPTTTACTGPWQRAPPLRLLVQKREADSSGLAFAS 
NSLGRRKKGLLLRPVAPLRTRPPLLISLPQDFRQVSSVIDVDLL 
PETHRRVRLHKHGSDRPLG FYXRDGMS VRVAPQG \LERVPG I FI 
SRLVRGGLAESTGLLAVSDEI LE VNG I BVAG KTLNQVTDMMVAN 
SHN\LIVTVKPANQRNNVVRGASGRIiTGPPSAGPGPAEPDSDDD 
SSDLVIENRQPPSSNGLSQGPPCWDIiHPGCRHPGTRSSLPSLDD 


5937 


31 


1600 


PTS LLKS T VQLMCRLIjQ DKR YQC V YSLAE I FKVIiAS F Y V I LVI L 
YGLTSS YSLWWMLRSS LKQYS FEALREKSNYSDIPDVKMDFAFI 
LHLADQYDPb YSKRFS I FLSE VSENKLKQINLNNBWTVEKLKSK 
LVKNAQDKIELHLFMLNGLPDNVFELTEMEVLSLELIPEVKLPS 
AVSQLVNLKELRVYHSS LWDHPALAFLEENLKI LRLKFTEMGK 
I PRWVFHLKNLKELYIjS GCVLPEQLSTMQLEGFQDLKNI^RTLYL 
KSSLSRIPQWTDLLPSI^KIjSLDNEGSKLVVIiNNLKKMVNLKS 
LELI SCDLERI PHS I FSLNNLHELDLRENNLKTVEE 1 1 S FQHLQ 
NLSCLKLWHNNIAYI PAQIGALSNLEQLSLDHNNIENLPLQLFL 
CTKI*HYIiDLS YNHLTFr PEEIQY1,\SNLQ YFAVTNNNI EMLPDG 








LFQCKKLQCLiIjIjGKNSLMNLS PHVGELSNLTHREPIG \i* Vleti7~" 
P PELEGCQS LKRNCL I VE ENLLNTL PLP VTERtiQTCIiD KC 


5938 


395 


186^ 


YKGEGFFCWQEARGEl^KKKKAMSSPNIWSTGSSVYSTPVFSQK 
MT VWI LLLLS L YPG FTSOKSDDD YED VAS M WVWUT 'VDmr-DTmrwr 

T VI LNN LLEGY DNKLRPDIG VKP TLI HTDM YVNS IGP VNAINME 
YTIDI FFAQTW YDRRLKFNST I KVL.RLNSNMVGKIWI PDTFFRN 
S KKADAHW I TT PNRMLR I WNDGR VL YSLRLTI DAECQLQLHN PP 
MDBHS CPLEFSS YG YPR E EI VYQ WKRSS VE VGDTRS WRL YQFS F 
VGLRNTTEWKTTSGDYWMSVYFDLSRJRMGYFTIQTYIPCTLI 
WLSWVSFWINIG5AVPARTSI^ITTVI.TMTTLSTIARKSLPKVS 
YVTAMDLFVS VCFI FVFSALVE YG \TLIIYFVSNRKPS KDKDKKK 
KNPAPTIDIRPRSATIQMNNATHLQERDEEYGYECLDGKDCASF 
FCCFELKIRTGAWRHGRIHIRIAKMDSYARIFFPTAFCLFNIjVYW 
VSYLYL 


5939 


66 


1404 


■ 1Ivrvj * vyAnDrutiKAbLci^'r r Ur L Voir»GSRIiNICDNDTL»KD 
LLKANVEKPVKMLIYSSKTLELRETSVTPSNLWGGQGI.LGVSIR 

fcsfix;anejwwhvlevesnspaaiaglrphsdyiigadtvmne 
sbdlfslietheakplklyvyntdtdncrev 1 1 tpnsawggegs 
lgcgigygyliiriptrpfeegkkislpgqmagtpitplkdgfte 

* rk3 uorr\j j. i\?x£»iioJu itji-t^i J.i»STP\FAVSSVL»STGV 
PTVP\LLPPQVNQSI.TSVPPMESSYIiHLPGLMPFTRQGLPNI>PQ 
PSTFNLPR\PTHSWPGVGLYQEFVKPGVLiPPLSSMPPRNLPG\I 
API.PLPSEFLPSFPLVPESSSAASSGELLSSLPPTSNAPSDPAT 
TTAKADAAS S LTVDVTPPTAKAPTT VEDRVGDS TP VSE KP VSAA 
VD AN AS ESP 


5940 


145 


717 


RRSAS RSAS PRQSAGTAVTTGTRAGGTCLAAAHHRMRWRADGRS 
LEKLPVHMGLVITEVEQEPSFSDIASLWWCMAVGISYISVYDH 
QGI FKRJ^SRI^DEILKOOQELLGIiDCSKYSPEFANSNI)KDI)QV 
LNCHIjAVKVLS PEDGKAD I VRAAQDFCQLVAQKQKRPTDIiDVDT 
LA\VYLVQMWli I LI 


5941 


13 


6147 



MCLGRMGASSPRSPEPVGPPAPGLPFCCGGSLLAVVVLI^PVA 

WGQCNA?EW\LPFARPTNLTDEFEFPIGTYLNYECRPGYSGRPF 

S 1 1 CL» KNS VWTGAKDRCRRKS CRNPPDPVNGMVHVIKG IQFGSQ 

IKYSCTKGYRI.IGSSSATCIISGLrrVIWDNETPICDRIPCGI.PP 

TITNGDFISTNREWFHYGSVVrYRCNPGSGGRICVFEiiVGEPSIY 

CTSNDDQVGI WSGPAPQCI I PNKCTPPNVENGI LVSDNRSIjFSIj 

NEWEFRCQPGFVMKGPRRVKCQALNKWEPELPSCSRVCQPPPD 

VLHAERTQRDKDNFSPGQEVFYSCEPGYDLRGAASMRCrPQGDW 

S PAAPTCEVKS CDDFMGQLLNGRVLFPVNLQLGAKVDFVCDEG F 

QLKGSSAS YC VLAGMESLWNSS VPVCEQI FCPS PPVI PNGRHTG 

KPLEVFPFGKAVNYTCDPHPDRGTSFDLIGESTIRCTSDPQGNG 

VWSSPAPRCGILGHCQAPDHFLFAKLKTQTNASDFPIGTSIiKYE 

CRPEYYGRPFSITCLDNLWSSPKDVCKRJCSCKTPPDPVNGMVH 

VITDIQVGSRINYS CTTGHRLIGHSSAECILSGNAAHWSTKPP I 

CQRI PCGLPPT I ANGD F I STWRENFHYGS WT YRCNPGS GGRKV 

FELVGEPS I YCTSNDDQVGIWSGPAPQCI IPNKCTPPNVENGIX, 

VSDNRS L FS LNE WEFRCQPG FVMKGPRR VKCQALNKWEPE L PS 

CSRVCQPPPDVLHAERTQRDKDNFSPGQEVFYSCEPGYDLRGAA 

SMRCTPQGDWS PAAP TCEVKS CDDFMGQLLNGRVL FP VMLQIiGA 

KVDFVCDEG FQLKGSS AS YCVLAGMESLWNSS VPVCEQI FCPS P 

PVIPNGRHTGKFLEVFPFGKAVNYTCDPHPDRGTSFDLIGESTI 

RCTSDPQGNGVWSSPAPRCGILGHCQAPDHFLFAKLKTQTNASD 

FP IGTSLKYECRPBYYGRPFS ITCLDNLVWSS PKDVCKR KSCKT 

PPDPVNGMVHVITDIQVGSRINYSCTTGHRLIGHSSAECILSGN 

rAHWSTKPPICQRIPCGLPPTIANGDFISTNRENFHYGSWTYR 

SNLGSRGRKVFEIiVGEPSIYCTSNDDQVGIWSGPAPQCIIPNKC 

rp PNVENGI L.VSDNRS LFSIiNE WEFRCQ PGFVMKG PRRVKCQA 

^KWEPBLPSCSRVCQPPPEIIiHGEHTPSHQDNFSPGQEVFYSC 

SPGYDIiRGAASLHCTPQGDWSPEApRCAVKSCDDFLGQLPHGRV 

jFPL,NLQLGAKVSFVCDEGFRI.KGSSVSHCVLVGMRSLWNNSVP 

^CEHIFCPNPPAIIJfGROTGTPSGDIPYGKEISYTCDPHPDRGM 



5942 



4509 



688 



5943 



2274 



TFNL I GES TI RCTS DPHGNGVWS S PAPRC E 1»S VRAGHCKTPEQF - 



PFASPriPINDFEFPVGTSLNYECRPGYFGKMFSISCLENLVWS 
SVEDNCRRKSCGPPPEPFNGMVHINTDTQFGSTVNYSCNEGFRL 
IGSPSTTCLVSGNNVTWDKKAPICEIISCEPPPTISNGDFYSNN 
RTSFHNGTWTYQCHTGPDGEQLFELVGERSIYCTSKDDQVGVW 
SSPPPRCISTNKCTAPBVENAIRVPGNRSFFSLTEIIRFRCQPG 
FVMVGSHTVQCQTNGRWGPKIiPHCSRVCQPPPEILHGEHTLSHQ 
DNFSPGQEVFYSCEPSYDLRGAASLHCTPQGDWSPEAPRCTVKS 
CDDFLGQLPHGRVLLPLNLQLGAKVSFVCDEGFRUCGRSASHCV 
LAGMKALWNSSVPVCEQIFCPNPPAILNGRHTGTPLGDIPYGKE 
VS YTCD PH PDRGM TFNL I GES T IRRTS EPHGNGVWS S PAPRCEL 
PVGAACPHPPKIQNGHYIGGHVSLYLPGMTISYTCDPGYLLVGK 
GFIFCTDQGIWSQLDHYCKEVNCSFPLFMNGISKELEMKKVYHY 
GDYVTIiKCEDGYTLEGSPWSQCOADDRWDPPLAKCTSRTHDALI 
VGTLSGTIFFI LLI I FIiSWI I LKHRKGNNAHENPKEVAI HLHSQ 

ggssvhprtlqtneensrvlp 

ylytomranpiaVgishkayqidpplVrkhreqVlvieWgrkl 
dk\aqmirfeertgyfsstdlgrtashyyikyntietfnelfda 

HKTEGDIFAIVSKAEEFDQIKVREEEIEELDTLLSNFCBLSTPG 
GVENSYGKINILLQTYINRGEMDSFSLISDSAYVAQNAARIVRA 
LFE IAIiRKRWPTMTYRLLNLSKAIDKRLWGWAS PLRQFS I LPPH 
MLTRLEEKKLTVDKLKDMRKDEIGHILHHVNIGLKVKQCVHQIP 
S VMMEAFI QP ITRTVLRVTLS I YADFTWNDQVHGTVGE PWWI WV 
EDPTNDHI YHSEYFLALKKQVI SKEAQLLVFTI PI FEPLPSQ YY 
IRAVSDRWEXSAEAVCIIKFQHIiILPERHPPHTELIjDLQPLPITA 
LGCKAYE ALYNFSHFNPVQTQ I FHTL YHTDCNVLLGAPTGSGXT 

vaaelai frvfnkyptskavyiaplkalvrermddwkvrieekl 
gkkvieltgdvtpdmksiakadlivttpekwcgvsrswqnrnyv 
qqvtiliideihllgeergpvlevivsrtnfisshtekpvrivg 
lstaianardladwlnikqmglfnfrpsvrpvplevhiqgfpgq 
hycprmasmnkpafqairshspakpvlifvssrrqtrltaleli 
aflateedpkqv^nweremeniiatvrdsnlkltijvfgigmhh 
aglherdrktveelfvnckvqv^ I atstlawgvnfpahlvi I kg 
teyydgktrryvdfpitdvlqmmgragrpqfddqgkavilvhdi 

KKDFYKKFLYEPFPVESSLLGVLSDHLNAErAGGTlTSKQDALD 
YI TWTYFFRRLIMNPS YYNLGDVSHDSVNKFLSHL IBKSIi I EliE 
hS YCIE IGEDNRS IEPLTYGRIASYYYLKHQTVKMFKDRLKPEC 
S TE ELLS I LSDAEE YTDLP VRHNEDHMNS ELAKCL P I ES NPHS F 
DSPHTKAHLLLQAHLSRAMLPCPDYDTDTKTVIJDQALRVOQAML 

dvaanqgwlvtvlnitnliqmviqgrwlkdsslltlpnienhhl 
hlfkkwkpimkgphargrtsteclpelihacggkdhvfssmves 
elhaaktkqawnflshlpeinvgisvkgswddlveghnelsvst 
ltadkrddnkwiklhadqeyvlqvslqrvhfgfhkgkpescavt 
prfpkskdegwflilgevdkrelialkrvgyirnhhvaslsfyt 
peipgryiytlyfmsdcylgldqqyd/nlsqrytsesfctgqhq 

GL 

dkptrhktylssswakmaaaegpvgd gelwOtwlpkhvVflrlr 
eglknqspteaekpassslpsspppqlltrnwfglggelflwd 
gedssflwrlrgpsggg\eepalsqyqrllcinpplfeiyqvl 
lsptqhhvaligikglmvlelpkrwgknsefeggkstvncsttp 
vaerfftsstsltlkhaawypseildphwlltsdnviriyslr 
epqtptnviilseabeeslvlnkgrayxaslgetavafdfgpla 
avpktlfgqngkdewayplyilyengetfltyisllhspgn/i 

WKAVGS IAHAS \ AAEDNYG YDACAVLCLPCVPNI LVIATESGML 

yhcwlegeeeddhtsekswdsridlipslyvfecvelelalkl 
asgeddpfdsdfscpvklhrdpkcpsryhctheagvhsvgltwi 
hklhkfi^sdeedkdslqelstbqkcfvehilctkplpcrqpap 

IRGFWIVPDILGPTMICITSTYECLIWPLLSTVHPASPPLLCTR 

edvevaesplrvlabtpdsfekhirsilqrsvakpaflkasekd 
iapppeeclqllsratqvfreqyilkqdlakeeiqrrvkllcdq 
kkkqledlsycreerkslremaeriadkyeeakekqedimwrmk 








KLLHS FHSEL PVLS DSERDMXKELQLI PDQLRHLGNAI KQVTMK 
KDYQQQKMEKVLSLPKPTIILSAYQRKCIQSILKEEGEHIREMV 
KQINDIRNHVNF 


5944




3428 


FS I ATFTDEPEVLTEPPS ATTTTTIGI SATWTTLAGSHGKRNNT 

ITTTSSKRKNRKNKITPENVQIIFDDPLPISYSQPEKVNGESKS 

SSTSESGDSDNMRISSCSDESSNSNSSRKSDNHSPAWTTTVSS 

KKQPSVLVTFPKEERKSVSGKASIKLSETISEGT5NSLSTCTKS 

GPS PLS S PNGKL7VASPKRGQKREEGWKEWRRS KKVSVPSTVI 

SRVIGRGGCNINAIREFTGAHIDIDKQKDKTGDRIITIRGGTE3 

TROATQLINALIKDPDKBIDELIPKNRLKSSSANSKIGSSAPTT 

TAANTS LMG I KMTTVALS STS QTATALTVPAI S SASTHKTIKNP 

VNXNVRPGFPVSFPXijAYPPPQFAHALLAAQTFQQIRPPRLPMT 

HFGGTF P PAQS TWG P FP VR PLS PARATN S P K PHMVPRHS NQNS S 

GSQVNSAGSliTSSPTTTTSSS ASTVPGTSTNGS PSS PSVRRQLF 

VTWKTSNATTTTVTTTASNNNTAPTNATYPMPTAKEHYPVSS? 

SSPSPPAQPGGVSRNSPLDCGSASPNKVASSSEQEAGSPPWET 

TNTRPPNSSSSSGSSSAHSNQQQPPGSVSQBPRPPLQQSQVPPP 

EVRMTVPPLATSSAPVAVPSTAPVTYPMPQTPMGCPQPTPKMET 

PAIRPPPHGTTAPHKNSASVQNSSVAVLSVNHIKRPHSVPSSVQ 

LPSTLSTQSACQNS VHPANKP I APNFS APL PFG P FS TLFENS PT 

SAHAFWGGSVVSSQSTPESMLSGKSSYLPNSDPLHQSDTSKAPG 

FRPPLQRPAPSPSG I VNMDS P YGS VTPSS THLGNFASNI SGGQM 

YG PGA PIjGGAPAAAN FNRQH FSPLS LLTP CSS ASNDS S AQS VS S 

G VRAP S PAPSSVPLGS EKPS NVS QDRKVP VPIGTERS AR I RQTG 

TSAPSVIGSNLSTSVGHSGIWSFEGIGGNQDKVDWCNPGMGNPM 

IHRPMSDPGVFSQHQAMERDSTGIVTPSGTFHQHVPAGYMDFPK 

VGGMPFS VYGNAM I PP VAP I PDGAGGP I FNGPHAADPS WNSLI K 

MVSSSTENNGPQTVWTGPWAPHMNSVHMNQLG 


5945 


1461 


197 


GVTHLFLFGKRKLRNGIAEDLKGQADFFFLLVSEAVVATGSPRA 
W LTCL, I LPL PGI I FS VL PKAMS R PLLI T FTPATD PS DLWKDGQQ 
QPQPEKPESTLDGAAARAFYEALIGDESSAPDSQRSQTEPARER 
KRKKRR IMKAPAAEAVAEGASGRHGQGRSLEAEDKMTHRILRAA 
QEGDLPELRRLLEPHEAGGAGGNINARDAFttWTPLMCAARAGQG 
AAVSYLLGRGAAWVGVCELSGRDAAQLAEEAGFPEVARMVRESH 
GETRS P ENRS PTPS LQ YCENCDTHFQDSNHRTS TAKLL S LS QG P 
QP PNLPIX3VP I SSPGFKLLLRGGWE PGMGLGPRGEGRANPI PTV 
LKRDQEGLG YRS APQ PRVTHF PAWDTRAVAGRE \ TP PRVATLSW 
REERRREE \KDRAWERDLRTYMNLEF 


5946 


541 


i«i6«i 


ilgsyssiqpeeys\swc\ewlqdlla\yvspk\hsylrdlp 

SEGS PQRVNS IDFV\EL\EHLQPDVLVHAVLRVVDF /TILTEAV 
YS YRGQKQKKVMLTVEQAQDQHYALVLWGPGAAW \ YPQLQRKKG 
YIWEFKYIjFVQCNYTLENLELHTTPWSSCECLFDDDIRAITFKA 
KFQKSAPS FVKISDLATHLEDKCSGVVL IRAQI SELAFP I TASQ 
KIALNAHSSLKSI FSSLPNI VYTGCAKCGLELETDENRI YKQCF 
SCLPFTMKKIYYRPALMTAIDGRHDVCIRVESKLIEKILLNISA 
DCLNRVIVPSSEITYGMVVADLFHSLIiAVSAEPCVLKIQSLFVL 
jj&ino z FLi yj\JLi c i>LiijU±? x PtJlVKHGANARL 


5947


3 


1317 


RG I PDRRRRGP IGRVNMDLENKVKKMGLGHEQGFGAPCLKCKEK 
CEGFELHFWRKICRNC\NVAKKSM/TVLLSNEEDRKVGKIiF3DT 
KYTTblAKLKSDGlPMYKRNVMILTNPWUUCKNVSINTVTYEWA 
P P VQNQALARQ YMQML P KE KQP VAGS EGAQ YRKKQLAKQL PAHD 
QDPSKCWEI^PREVKEMEQFVKKYKSEALGVGDVKLPCEMDAQG 
PKQMN I PGGDRS TPAAVGAMEDKSAEHKRTQ YS CYCCKLSMKEG 
DPAIYAERAGYDKLWHPACFVCSTCHELLVDMIYFWKNBKLYCG 
RHYCDSEKPRCAGCDBLIFSNEYTQAENQNWHLKHFCCFDCDSI 
LAG E I YVMVNDKP VC KP CYVKNHAVVCQGCHNAI DP EVQRVT YN 
NFS WHASTECFLCSCCS KCL I GQK FM P VEGMVFCS VE CKKRM S 


5948


39 


3370 


YRERYPVSGGSVIiRSALEVC^DFLSGLTEGSliLtPEGFFSGPIDQ 
^HYQMRRKGRCHRGSAARHPSSPCSVKHSPTRETLTYAQAQRM 
VEI E IEGRLHRIS I FDPLE I ILEDDLTAQEMSE CNSNKENSERP 
PVCLRTKRHKNNRVKKKNEALPSAHGTPASASALPEPKVRIVEY 



5949






39



3370



5951



143



S PPSAPRRP P V YYKF I EKSABELDNEVE Y1>Ml)EEDYAWLEI VNE 
KRKGDCVPAVSQSMFEFLMDRFEKESHCENQKQGBQQSLIDBDA 
VCCI CMDGECQNSNVTLFCDMCNLAVHQECYG VPY I PEGQWLC / 
RAHCLQS RARPADC VLC PNKGGAF KKTDDDRWGHV\ VCALW \ I P 
E\VGFANTVFrEPIDGVRNIPPARWKLT\CNLCKEKGR/VGACI 
QCHKANCYTAFHVTCAQKAGLYM KMEPVKELTGGGTTFSVRKTA 
YCDVHTPPGCTRRPLN I YGDVEMKNGVCRKESS VKTVRS TS KVR 
KKAKKAKKALAEPCAVLPTVCAPYI PPQRLNRIANQVAIQRKKQ 
FVERAHSYWLLKRLSRNGAPLLRRLQSSLQSQRSSQQRENDEEM 
KAAKEKLKYWQRLRHDLERARLLIELLRKREKIiKREQVKVEQVA 
M EL R LTPLTVLL RS VLDQLQDKD PAR IFAQ P VS LKE VPD YLDH I 
KHPMDFATMRKRLEAQGYKNLHEFEEDFDLIIDNCMKYNARDTV 
FYRAAVRLRDQGGWliRQARREVDSIGLEEASGMHLPERPAAAP 
RRPFSWEDVDRLLDPANRAHLGLEEQLRELLDMLDLTCAMKSSG 
SRS KRAKLLKKE I ALLRNKLSQQHSQPLPTGPGLEGFEEDGAAL 
GPEAGE E VL PRLETXLQPRKRS R S TCGDS EVE EES PG KRLDAGL 
TNGFGGARSEOEPGGGLGHTfATDPPwr'nGwooTcjcsoxTo^T 



- „ '^^«^'^v*v^xxjiijUFKOI51jlSL;XJEWGNYAKAARZAAEV 
GQSSMWXSTDAAASVLEPLKVVWAKCSGYPSYPALIIDPKMPRV 
PGHHNGVTIPAPPLDVLKIGEHMOTKSDEKLFLVLFFDNKRSWQ 
WL PKS KMVPLG IDETIDKLKMMEG RNS SIRKAVR I AFDRAMNHL 
SRVHGEPTSDLSDID 

yRERYPVSGGSVLRSALEVc WDFLSGLTEGSLLPEGFFSGPIDp ' 




373 



"§449" 



v * wRunK loir ujfiiis x a JbbDDI»TAQEMS ECNSNKENS ERP 
PVCLRTKRHKNNRVKKKNEALPSAHGTPASASALPEPKVRIV3Y 
S PPS APRRPPVYYKFIEKS AEELDNEVE YDMDEBD YAWLE I VNE 
KRKGDCVPAVSQSMFEPLMDRFEKESHCENQKQGEQQSLIDEDA 
VCCI CMDGEOQNSNVI LFCDMCNLAVHQECYGVP Y I PEGQWLC / 
RAHCLQ S RARPAD CVLCPNKGGAFKKTD DDRWGHV \ VCALW \ I P 
E\ VG FANTVFIE P IDGVRN I PPARWKLT\ CMLCKEKGR/VGACI 
QCHKANCYTAFHVTCAQKAGLYM KMEPVKELTGGGTTFSVRKTA 
YOTVHTPPGCTRRPl^IYGDVEMKNGVCKKESSVKTVRSTSKVR 
KKAKKAKKALAE PCAVLPTV CAP Y I P PQRLNRI ANQ VAI QRKKQ 
FVERAHSYWLLKRLSRNGAPLLRRLQSSLQSQRSSQQRENDEEM 
KAAKEKLKYWQRLRHDLERARLLIELLRKREKLKREQVKVEQVA 
MELRLTPLTVLLRS VLDQLQDKDPAR I FAQPVSLKE VPDYLDHI 
KHPMDFATMRKRIiBAQGYKNLHEFEEDFDLIIDNCMKYNARDTV 
FYRAAVRLRDQGGVVLRQARREVDSIGLEEASGMHLPERPAAAP 
RRPFSWBDVDRIiDPANRAHlXJLEEQLRELLDMLDLTCAMKSSG 
S RS KRAKLLKKE IALLRNKLSQQHSQPL PTG PGLEG FESDGAAL 
GPEAGEEVLPRLETLLQPRKRSRSTCGDSEVEEESPGKRLDAGL 
TKGFGGARSEQEPGGGLGRKATPRRRCASESSISSSNSPLCDSS 
FNAPKCGRGKPALVRRHTIiEDRSELIS CI ENGNYAKAARI AAB V 
GQS SMW I S TDAAAS VLB PLKWWAKCSG Y PS Y PALI I D PKMPR V 
PGHHNG VTI PAP PLD VLKIGEHMQTKSDEKLFL VLFFDNKRS WQ 
WL P KS KMVPLG I DET ID KLKMMEGRNS S I RKAVRI AFDRAMNHL 
SRVHGEPTSDLSDID 

ESRSLTMSTSQPGACPCQGAASRPAI LYALLSSSLKAVPR^RSR 
CLCRQHRPVQLCAPHRTCREALDVLAKTVAFLRNLPSFWQLPPQ 
DQRRLLQGCWGPLFLLGLAQDAVTFEVAEAPVPSILKKILLEEP 
SSSGGSGQLPDRPQPSLAAVQWLQCCLESFWSLELSPKEVYACL 
KGPILFNPDVPGLOAASHIGHIX3QEAHWVLCEVLEPWCPAAQGR 
LTRVLLTASTLKSI PTSLLGDLFFRPI I GD VP I AGLLGDMLLLR 

WNVKPSLLWQLFKFSDKEE HEQNDSISGKTGETGVEBMIATR K 
VEQDS KETVKLSHEDDHI LEDAGSSDI SSDAACTNPNKTEWS LV 
GL PS CVDE VTE CNL ELKDTMG I ADKTENTLERNK I EPLG YCEDA 
ESNR0LESTBFNKSNLEVVT>TC'rcr2r>TroNTTT T?M-n T «r»^r*s«^, . 



wuimxosaiVAfionB AMwijyuUKJMSQSSSVSYLESKSVKSKHTKP 
VIHSKQNMTTDAPKKJVAAKYEVIHSKTKVNVKSVKRNTDVPES 
QQNFHRPVKVRKKQIDKEPKIQSCNSGVKSVKNQAHSVLKKTLQ 








DQTLVQIFKPbTHSLSDKSHAHPGCLKEPHHPAQTGHVSHSSQK" 
QCHKPQQQAPAMKTNSHVKBELEHPGVEHFKEEDKLKLKKPEKN 
LQPRQRRSSKSFSLDEPPLFIPDNIATIRREOSDHSSSFESKYM 
WTP S KO. CG F CKKPHGNRFM VG CGRCDDW FHGDC VGLS LS QAQQM 
GBEDKEYVCVKCCAEEDKKTEILDPDTLENQATVEFHSGDKTME 
CE KLGL S KHTTNDRT K Y I DDT VKHKVKI LKRESGEGRNS SDCRD 
NEIKKWQLAPLRKMGQPVLPRRSSEEKSEKIPKESTTVTCTGEK 
AS K PGTHE KQEMKKKKV \E KGVIoNVHPAASAS KPS ADQ I RQS VR 
HSLKDILMKRLTDSNIiKVPEEKAAKVATKI EKELFS FFRDTDAK 
YKNK YRSLM FNLKD P KNNT LFKKVLKGEVTPDHL I RMS P EE LAS 
KELAAWRRRENRHTIEMrEKEQREVERRPITKITKKGEIEIESD 

apmkeqeaameiqepaankslekpegsek\r:<eevdsmskdtts 

QHRQHLFDLNCKI C IGRMAPP VDDLS PKKVKWVGVAR KHSDNE 
AESIADADSSTSNILASEFFEEEKQESPKSTPSPAPRPEMPGTV 
EVES TFLARLN F I WKGFINMPSVAKFVT KAY? VSGS PE YDTEDL 
PDS I Q VGGR I S PQTVWD YVE K I KASGTKE I CWRFT P VTE EDQ I 
S YTLLFAYFSSRKR YGVAANNMKQVKDMYL I PLGATDKI *>HFLV 
P FDG PGLE LHRPMLLLGL 1 1 RQKL KRQHS ACAS TSH I AETPE S A 
PP I ALi P PD KKS KI EVS TEEAPEE ENDF FNS FTTVLHKQRNKPQQ 
NLQEDLPTA VE PLME VTKQEP P KPhR FI#PG VL I GWENQPTTLEL 
ANKPLPVDDILQSLIjGTTGQVYDQ\AQS VMEQNTVKBI P FLNEQ 
TNS K I E KTDNVE VTDGENKE I KVKVDN I SES TDKSAE I ETS WG 
SSSISAGSLTSLSLRGKPPDVSTEAFLTNLSIQSKQBETVESKE 
KTLKRQLQEDQENNLQDNQTSNSS PCRSNVGKGNIDGNVS CSEN 
LVANTARSPQFTNLKRDPRQAAGRSQPVTTSESKDGDSCRNGEK 
HMLPGLSHNKEHLTEQINVEEKLCSAEKNSCVQQSDNLKVAQNS 
PS VENI QTS QAEQAKP LQE DI LMQNI ETVHP FRRGSAVATSH FB 
VGNTCPSE F P S KS ITFTSRSTS PRTS TNFS PMRPQQPNLQHLiKS 
S P PG FP FPGP PN FPPQS MFGFPPHL P PPLLP P PGFG \ FA\ QNPM 
VPWPPW\HLP\GQPQRMMGPLSQASRYIGPQNFYQVKDIRRPE 
RRHSDPWGRQDQQQLDRPFNRGKGDRQRFYSDSHHUCRERHEKE 
WBQESERHRRRDRSQDKDRDRKSREEGHKDKERARLSHGDRGTD 
GKASRDSRNVDKKPDKPKSEDYEKDKEREKSKHREGEKDRDRYH 
KDRDHTDRTKSKR 


5952 
5953


3226 


639 


PPARRSARDliPRALSMEAARPSGS WNGALCRLL \LVTlI \AFI>I F 
ASDACKNVTLHVPSKIiDAEKIiVGRVNLKECFTAANLIHSSDPDF 
QI LEDGS VYTTNT I LIiS S E KRS FT I LltSNTENQE KKK I FVFL EH 
QTKVLKKRHTKEKVLRRAKRRWAPIPCSMLENSLGPFPLFLQQV 
QS DTAQNYT I Y YS I RG P G VDQEPRNL F YVERDTGNL YCTRP VDR 
EQYESFEIIAFATTPDGYTPELPLPLIIKIEDEMDNYPIFTEET 
YTFTI FENCRVGTTVGQVCATDKDEPDTMHTRLKYSI IGQVPPS 
PTLFS MHPTTG V I TTTS S QLDREL I DK YQLK I KVQDMDGQ YFGL 
QTTSTCI INIDDVNDHLPTFTRTSYVTS VEENTVDVEILRVTVE 
urajuvn liu^WKANYTIIjKGNENGNFKIVTDAKTNEGVLCVVKPL 
NYEEKQQMIIiQIGWNEAPFSREASPRSAMSTATVTVNVBDQDE 
GPECNPPIQTVRMKBNAEVGTTSNGYKAYDPETRSSSGIRYKKL 
TDPTGWVTIDENTGSIKVFRSLDREAETIKNGIYNITVLASDQG 
1 ^ XJa 1 Lj ^ ± x jjyD VNDNS PF IPKKT V I 1 CKPTMSSAEIVAVDP 
DEPIHGPPFDFSLESSTSEVQRMWRLKAINDTAARLSYQNDPPF 
GSYWPITVRDRJ^MSSVTSLDVTLCDCITENDCTHRVDPRIGG 
GGVQLGKWAILAlLLGIALFFCILFTIiVCGASGTSKQPKVIPDD 
LAQQNL I VSNTE APGDDKVYS ANGFTTQT VGASAQG VCGTVGS G 
I KNGGQETI EM VKGGHQTSESCRGAGHHHTLDS CRGGHTE VDNC 
R YTYS E WHS FTQ PRLGEES I RGHTL I KN 


5954 


330 

32 I 


811 



PLLCNPDPGWYWWVKQESEISKESQEMDARPKI<DI/3FKEGQTIK 
bCIGNITNKKGGASKPRTARGGGLSLLPPPPGGKVTrPPPSS /V 
OjPSTNHVTPPS IPKSNHGGSDADIliI*DLDS PAPVTTPAPTPVS 
/SNDLWGDFSTASSSVPNQAPQPSNWVQF 






2130 I 
I 


^PPPPPKIi^MADLEAVIjADVSYIjMAMEKSKATPAARASKRIVL 

^epsirsvmqkyiaerneitfdkifnqkigfulfkdfclneine 

kVPQVKFYE2IKEYEKLDNEEDRLC3?SRQIYDAYIMKELLSCSH 



5356 



5958 



1726 



1705 



444 



PFSKQAVEHVQSHLSKKQVTSTLFQPYIEEICESLRGDtFQKPM 
ESDKFTRFCQWKNVELNIHLTMNEFSVHRIIGRGGFGEVYGCRK 
ADTGKMYAMKCLNKKR I KMKQGETIiAMER I MLSLVSTGDCPFI 

VCMTYAFHTPDKLCFILDLMNGGDLHYHLSQHGVFSBKEMRFYA 
TE 1 1 LGLEHMHNRFVVYRDL KPANI LLDEHGHAR I S \DLGLACD 
FSKKKPHASVGTHGYMAPEVLQKGTAYDSSADWFSLGCMLFKLI> 
RGHS PFRQHKTKDKHE IDRMTLTVNVELPDTFS PELKSLLEGLL 
QRDVSKRIiGCHGGGSQEVKEHSFFKGVDWQHVYLQKYPPPLlPP 
RGEVNAADAFDIG5FDEEDTKGIKLLDCX>QELYKNFPLVISERW 
QQEVTETVYEAVNADTDKI EARKRAKNKQLGHEEDYALGKDCIM 
HGYMLKLGNPFLTQWQRRYFYLFPNRLEWRGEGESRQNLLTMEQ 
ILS VEETQ I KDKKCILFRI KGGKQFVLQCESDPE FVQWKKELNE 
TFKEAQRLLRRAP KFLNK PR5GTVELP KPS LCHRNSNGL 



139 



KREREFRIiAVCPLRYPSAYESSPGTELRECGLCRSGQEFADCRR 
PANRQDVLSGWINLPVLQLTKDPLKTPGRIiDHGTRTAFIHHREQ 
VWKRCINI WRD VGliFGVLNEI ANS EEEVFE W VKTASGWALAL CR 
WAS SLHGS LFPHLS LRSEDLI AE FAQVTNWS S CCLRVFAWH PHT 
NKFAVALLDDSVRVYNASSTIVPSLKHRLQRNVASLAWKPLSAS 
VIAVACQSCILIWTLDPTSLSTRPSSGCAQVLSHPGHTPVTSIjA 
WAPS GGRLLS AS PVDAA I R VWDVS TETCVPL P WFRGGG VTNLL W 
SPDGSKIIATTPSAVFRVWEAQMWTCERWPTLSGRCQTGCWS PD 
GSRLLFTVLGEPLIYSLSFPERCGEGKG\ALBVQSQQRLWQICL 
RQQYRHQMVRRGLGERLTPWSGTPVGNVWLCL 



1479 



451 



GVG VRGARAMATVQEKAAALNLiSALHS P AHR P PGF S VAQKP FGA 
TYVWS S I INTLQTQ VEVKKRRHRLKRHNDC FVG S EAVD V I FSHL 
I QNKYFGD\/DI PRAKVVRVCQALMDYKVFEAVPTKVFGKDKKPT 
FEDS S CS L YR FTT I PNQDS QLG KENKLYS PAR YADAIiFKS S D I R 
SAS LEDLWENLSLKP ANS PH VNI SAILS PQ VI NE VWQEET I GRL 
LQLVDLPLLDSLLKQQBAVPKIPQPKRQSTMVNSSNYLDRGILK 
AYSDSQEDEWLSAAIDCSEYLPDQMWEISRSFPEQPDRTDLVK 
ELLFDAIGRYYSS REPLLNHLSDVHNG I AELLVNGKTE IALEAT 
QLLLKLliDFQNREEFRRLLYFMAVAANPSEFKLQKESDNRMWK 
RI FSKAI VDNKNLSXGKTDLLVI>FI*\MDHQKDVFKI PGTL\HKI 
VS \ VK\ LMAIQNGRDPNRDAG YI YCQRIDQRD YSNNTEKTTKDE 
LLNLLKTLDEDSKLSAKEKKK\IjLGQFYKCHPDI F I EHFGD 



3138 



ELQVAVAMDTLDRWKPKTKRAKRFLEKREPKLNENI KNAMLI K 
GGNANATVTKVLKDVYALKKPYGVLYKKKNITRPFEDQTSLEFF 
SKKSDCSLFMFGSHNKKRPNNJbVIGRMYDYHVIiDMIELGIENFV 
5LKDIKNSKCPBGTKPMLIFAGDDFDVTEDYRRLKSLLIDFFRG 
P WSNlRIiAGZJB YVLHFTAZiNG KI YFRS YKLLLKKSGCRTPRI E 
LEEMGPSI^LVLRRTHLASDDLYKLSMKMPKALKPKKKKNISHD 
TFGTTYGRIHMQKQDLS KLQTRKM\KGLKKRPAERIT3DHEKKS 
KRI KKKLME LSQ PLLFHCVLLKRI IKHQS IQSFL 



! AAALGMIili W FPACQAFN LDVEKLT VYSGPXGS Y FG YAVDFH IPD~~ 
ARTASVLVGAPKANTSQPDI VEGGAVY YCPWPAEGS AQCRQI P F 
DTTNNRKI RVNGTKE P I EFKSNQ WFG \ ATVKA\HKGKS CGPVAP 
LLPTWRNFLKPTPEKGPVGTCYVAIQNFSAYAEFSPCGNSNADP 
EGQGYCQAGFSLDFYKNGDLI VGGPGSFYWQGQVITASVAD I IA 
NYSFKDILRKLAGEKQTEVAPASYDDSYIiGYSVAAGEFTGDSQQ 
EL VAG I PRGAQNFGYVS I XNS YDMTFIQNFTGEQMAS YFG YTW 
VSDVNSDGLDDVLVGAPLFMEREFESNPRBVGQIYLYLQVSSLL 
FRDPQILTGTETFGRFGSAMAHLGDLNQDGYNDIAIGVPFAGKD 
QRGKVL I YNGNKDGLNTKPF PKFCQG VWAS HAVPSG FG FTLRGD 
SDIDKNDYPDLrVGAroTGKVAVYRARPVVTVDAQLLLHPMI IN 
LENKTCQVPDSMTSAACFSLRVCASVTGQS IANTIVLMAEVQLD 
SLKQKGAI KRTLFLDNHQAHRVFPLVIKRQKSHQCQDFIVYLRD 
ETEFRDKLS P INISLNYS LDESTFKEGLEVKP I LNY YRENI VS E 
QAHI LVD CGEDNLCVPDLKLSARPDKHQVI IGDENKLMLI INAR 
NEGEGAYEAELFVM I PEEADYVGI ERNNKGFRPLSCEYKMENVT 
RMWCDLGNPMVSGTNYSLGLRFAVPRLEKTNMS INFDLQI RS S 
NKDN PDSNF VS LQ INITAVAQVEI RGVSHPPQ I VLP IHNWSPEE 
SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide (A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop Codon, /=possible nucleotide deletion, \=possible nucleotide insertion)



595S 



EPHKBEEVGPIiVEHIYELHNIGPSTISDTILEVGWPFSARDEFIi
LY I FHIQTLGPLQCQPNPNINPQDI KPAASPSDTPEliSAFLRNS 
TI PHLVRKRDVHWEFHRQSPAKI LNCTNI ECLQI SCAVGRLEG 
GESAVLKVRSRLWAHTFLQRKNDPYALASLVSFEVKKMPYTDQP 
AKLPEGS I AI KTS VI WATPNVS FS I PLWVI IIAILLGLLVLAIL 
T1ALWKCG FFDRAR PPQEDMTDREQLiTNDKTP EA 



1166

I GTSG YAAQQLP 3 LLKERJB FHLGTLNKVFAS QWLNHRQW CGT KC

NTLFVVDVQTSQITKIPILKDREPGGVTQQGCGIHAIELNPSRT 
LLATGGDNPNS LAIYRLPTLDPVCVGDDGHKDWI FS IAW ISDTM 
AVSG S RDG SMGLWE VTDDVLTKS DARHNVS RVP VYAH I THKALK 
01 PKEDTNPDNCKVRALAFNNKNKELGAVSLDG YFHLWKAENTL 
S KLLSTKLP YCRENVCLAYGSEWS VYAVGSQAHVSFLDPRQPS Y 
NVKSVCSRERG SGIRS VS FYEH1 1 TVGTGQGS LLF YD IRAQRFL 
EERLSACYGSKPRLAGENI>KLTTG\KGWLNHDETWRNYFSDIDF 
FPNAVYTH CYD5 SGTKLFVAGGPLPSGLHGNYAGLWS 



2853 



870 



5961 



198



5962



20 



FVWSDGGPRPRRGPAVGAGAAHLSDPWAMTPGTANRAT^PLNKE"* 
LDWAS I NG FC BQIjNED FEGP P LATR LLAH KI QS PQE WEAI QALT 
VLETCMKSCGKRFHDE VGKFRFLNEL I KWS PKYLGSRTS EKVK 
NKILELLYSWTVGLPEEVKIAEAYQMLKKQG\rVKSDPKLPDDT 
TFPLPPPRPKNVI FEDEEKSKMLARLIiKSSHPEDLRAANTCIilKE 
MVQEDQKRMEKISKRVNAIEEVNNNVKLLTEMVMSHSQGGAAAG 
SSEDL\MKEL\YQRCERMRPTLFPTGRVDTEDND\EAIiAEXLQA 
NDNLTQVINLYKQLVRGEE VNGDATAGS I PGSTS ALLDLSGLDL 
PPAGTTYPAMPTRPGEQASPEQPSASVSLLDDELMSLGLSDPTP 
PSGPSLDGTGWNS FQSSDATEPPAPAIAQAPSMESRP PAQTSLP 
ASSGLDDIiDLLGKTLLQQSLPPESQQVRWBKQQPTPRLTLRDLQ 
NKSSSCSSPSSSATSLLHTVSPEPPRPPQQPVPTELSLASITVP 
LESIKPSNILPVTVYDQHGFRILFHFARDPLPGRSDVLWWSM 
LS TAPQP I RN I VFQSAVp KVMKVKLQP PSGTEL PAFNP I VHP SA 
ITQVLLLANPQKEKVRLRYKLTFTMGDQTYNEMGDVDQFPPPET 



WGSL 



314

SGEPRPEPGNMATCIGEKIEDFKVGWLLGKGSFAGVYRAESIHT

GLEVAIKMIDKKAMYKAGMVQRVQNEVKIHCQLKHPSILELYNY 
FEDSNYVYLVLEMCHNGEMNRYLKNRVKPFSENEARHFMHQI IT 
GMLYLHSHG I LHRDLTLSNLLLTRNMNIKIADFGLATQLKMPHE 
KH YTLCGTPNY I S P EIATRS AHGLES D VWS IK3CMFYTLL I GR P P 
FDTDTVKNTLNKWLAD YEM PTFLS IEAKDLIHQLLRRNPADRL 
SLSSVLDHPFMSRNSSTKS KDLGT VEDS IDSGHATISTAI TAS S 
STSISGSLFDKRRLLIGQPLPNKMTVFPKNKSSTDFSSSGDGNS 
FYTQWGNQETSNSGRGRVIQDAEERPHSRYLRRAYSSDRSGTSN 
SQSQAKTYTMERCHSAEMLS VSKRS GGGENEER YS PTDNNANI F 
NFFKEKTS SS SGS FERPDNNQALSNHL.CPGKTP FPFADPTPQTE 
TVQQWFGNLQINAHLRKTTEYDSISPNRDFQGHPDLQKDTSKNA 
WTDTKVKKNSDAS DNAHS VKQ QNTMK YMTALHS KP E 1 1 QQE C VF 
GSDPLSEQSKTRGM3PPWG YQNRTLRS I TS PLVAHRLKP IRQKT 
KKAWS I LDSEEVCVELVKE YASQE YVKE VLQI SSDGNTI TI YY 
PNGG\RGFPIA\DRPPSPT\DNISR\YSF\DNLPEKYWRKYQYA 
S RFVQL VRSKS P K I TYFTR YAKCI LM ENS PGADFEVW F YDG VK I 
HKTEDFIQVIEKTGKSYTLKSESEVNSLKEEIKMYMDHANEGHR 
ICIALBS I rSEEERKTRSAPFFPIIIGRKPGSTSSPKALSPPPS 
VDSN Y PTRDRAS FNRMVMHS AAS PTQAP I LNPSM VTN3GLGI/TT 
TASGTD I S SNS L KDCLPKS AQ LL KS VF VKNVG WATQ\ LTSGAVW 
VQFNDGSQLWQAGVSS ISYTSPNGQ\TTR\ YGENEKLPDYI KQ 
KLQCLS S I LLMFSNP TPNFH 



2447 



RVCSS s astasqavmAdawee i RRIiAADFQRAQFAEATQRLs ER 
NCIEIVNKLIAQKQLEWHTLDGKEYITPAQISKEMRDELHVRG 
GR VN I VDLQQ VINVDL IH IENRIGDI I KSEKHVQL VLGQLI DEN 

yldrlaeevndklqesgqvtiselcktydlpgnfltqaltqrlg 

R I ISGH I DLDNRG V I FTEAF VARHKAR IRGLFSA I TRPTAVNS L 
ISKYGFQEQIil^YSVI/EELVNSGRLRGTVVGGRQDKAVFVPDIYS 
RTQSTWVDSFFRQNGYIiEFDAIWSRLGI PDAVSYIKKRYKTTQIiL 








FLKAACVGQGLVDQVEASVEEAISSGTWVDIAPLIjPTSIiSVEDA
AI LLQQVMRAFS KQAST WFSDTVWSEKF\ INDCTEL FRELMH 
QKAEKEMKNNPVHLI TEEDLKQI S TLES VSTSKKDKKDERRRKA 
TEGSGSMRGGGGGNAREYKIKKVKKKGRKDDDSDDESQSSHTGK 
KKPEISFMFQDEIEDFLRKHlQDAPESFISEliAEYLIKPLNKTY 
LEWRSVFMSSTTSASGTGRKRTIKDLQEEVSNLYNNIRLFEKG 
MKFFADDTQAALTKHLLKSVCTDITNLIFNFLASDLMMAVDDPA 
AI TS E I RKKI L S K LS EETKVALT KXHNSLNE KB I ED FI S CLDSA 
AEACD I MVKRGDK KRERQ I L FQHRQ ALAEQL KVTEDPAL I LHLT 
SVLLFQFSTHSMLHAPGRCVPQIIAFLNSKIPEDQHALLVKYQG 
LWKQLVSQSKKTGQGDYPLNNELDK3QEDVASTTRKELQELSS 
SI KDLVLKSRKSSVTEE 


5963
5964


62 


1130 


PWNPQDFPGNRGLMG\QKGEIGPP\GQQGKKGAPGMP\GLMGSN 
GS PGQPG TPGS KGS KGEPG IQGMPGASGLKGEPGATGS PGE PG Y 
MGLPGIQGKKGDKGNQGEKGIQGQKGENGRQGIPGQQGIQGHHG 
AKGERGE KGE PG VRG AI GS KGES GVDGLMG PAG P KGQPGD PG PQ 

GPPGLDGKPGREFSEQFIRQVCTDVIRAQIiPVLLQSGRIRKrCDH 
CLSQHGS PGI PGP PGP I GPEGPRGLPGLPGRDGVPGLVGVP GRP 
GVRGLKGLPGRNGEKGSQGFGYPGEQGPPGPPGPEGPPGISKEG 
P PGD PGLPGKDGDHGK PG I QGQPGPPG I CDPSLCFS VI ARRDPF 
RKGPNY 


5965 


3 


2147 


SCRTRGRLSPLQPREAGSSRGSRARSEPPRPGGMEEACQVQTTK 
RGDPHELRNIFLQYASTEVDGERYMTPEDFVQRYLGLYKDPNSN 
P KI VQLLAGVADQTKDGL I S YQE FLAFES VLCAPDSMF I VA FQL 
FDKSGNGEVTFENVKEI FGQTI IHHHI P FNWDCEF I RLHFGHNR 
KKHLNYTEFTQFLQELQIiERARQAFALKDKSKSGMISGI»DFSDI 
MVT I RSHMLTP FVE ENLVS AAGGS I S HQ VS FS Y FNAFNS LLNNM 
ELVRKI YS TIAGTR KDAE VTKE EFAQS AI RYGQATPLEIDI I* YQ 
rJUDLYNASGRLTLADIERIAPLAEGALPYNIAELQRQQSPGLGR 
P I WLQ I AE S AYRFTLGS VAGAVGATAVYP I DL VKTRMQNQRGSG 
S WGELMYKNS FD CFKKVLR YEGFFGL YRGLI FQ L IG VAPB KAI 
KLTVKDFVRDKFTRRDGSVPLPAEVIiAGGCAGGSQVIFTNPLEI 
VKIRLQ VAGE ITTG PR VS ALNVLRDLG I FGLYKGAKACFLRD I P 
FS A I Y F P VYAHCKLLIiADENGHVGGLNIiLAAGAMAG \ VPAAS LV 
TPADVIKTRLQVAARAGQTTYSGVIDCFRKIL\REEGPSAFWKG 
TAARVFRSSPQFG\VTI»VTYELLQRGFYIDFGGI»KPAGSEPTPK 
S R I ADL P PANPDH IGG YRLATAT FAG I ENK FGL YLP KFKS PS VA 
WQPKAAVAATQ 




1 


1498 


MVTVnjYRFIiPTSNMAAKLRSLLPPDLRLQFWLHARLQKCFLSRG 
CGSYCAGAKASPLPGKMAMGLMCGRREIjLRLLQSGRRVHSVAGP 

sqwlgkpltrriilfpaapcccrphylflaasgprsls tsais fa 
evqvqappwaatpsptavpbvasgetadwqtaaeqsfaelgl 
gs ytpvgl iqnllefmhvdlglp wwgaiaact vfarcli fpli v 

TGQREAAR IHNHL PE I QKFS SR I REAKIiAGDH IE Y YKASSEMAL 
YQXKHG I KIjYKPL I L P VTQA P I F I S FF I ALREMANL PVPS LQTG 
GLW WPQDI*TVSDP I Y I LPLAVTATMWAVLELGAETG VQSSDLQW 
MRNVIRMMPL I TLP I TMHFPTAV FM YWIiS SNL FSLVQVSCLR I P 

AVRTVLKIPQRWHDLDKLPPREGFLESFKKGWKNAEMTRQLRE 
REQRMRNQI*ELAARG PL RQTFTHNP LLQPGKDNP PNI PSS \SS S 
SSKPKSKYPttHDTLG 




102 


1925 



RSKQVMARLTKRRQADrKAIQHLWAAIEIIRNQKQIANIDRITk 
YMSRVHGMHPKETTRQLSIiAVKDGL I VETLT VG r KCl <5 tr a r w 
G YWLPGDE I D WETENHD W YCFECHL PGE VL I CDLCFR VYHS KCI. 
S DE FRLRDS SS PWQCPVCRS I KKKNTNKQEMGT YLRFI VSRM KE 
RAIDLNKKGKDNKHPMYRRLVHSAVDVPTIQEKVNEGKYRSYEE 
FKADAQIiLLHNTVIFYGADSEQADIARMLYKDTCHEL\DELQLC 
KNCF YLANAR PDNWFC YPC I PNHELD WAKMKG FGFWPA2CVMQKE 
DNQ VDVRF FGHHHQRAW I PS EN I QD I TVNI HRLHVKRS MG WKKA 
:DELEUiQRFLREGRFWKSKNEDRGEEEABSSISSTSNEQLKVT 
3EPRAKKGRRNQSVEPKKEEPEPETEAVSSSQEIPTMPQPIEKV 
3VSTQTKKIiSASS PRMLHRSTQTTNDGVCQSMCHDKYTKI FNDF 




5967 






KDRMKSDHKi^TBRVVREALEKLRSEMEEEKRQAVNKAVANMQG 
EMDRKCKQVKEKCKEEFVEEIKKLATQHKQLISQTKKKQWCYNC 
EEEAMYHCCWNTSYCS IKCQQEHWHAEHJCRTCRRKR 




5968 


102 


1925 


RS KQVMAKiiTKKRQAiyjt'KAIQHLWAAIE 1 1 RNQKQIANI DR 2 TK 
YMSRVHGMHPKETTRQLSLAVKDGLIVETLTVGCKGSKAGIEQE 
G YWLPGDE I D WE TENHDW YC FE CHLPG E VL I CDLC PRVYHS KCL 
S DE FRLRDS S S P WQCP VCRS I KKKNTN KQ EMGT YIiRF I V«5RMITR 
RAIDUiKKGKDNKHPMyRRLVHSAVDVPTIQEKVNEGKYRSYEE 
FKADAQLLIiHNTVIFYGADSEQADIARMLYKDTCHEL\DELQLC 
KNCFYLANARPDNWFCYPCIPNHELDWAKMKGFGFWPAKVMQKE 
DNQVDVRFFGHHHQRAWIPSENIQDITVNIHRLHVKRSMGWKKA 
CDELELHQRFLREGR FWKS KNEDRGEEEAES S ISSTSNEQLKVT 

QEPRAKKGRRNQSVEPKKEEPEPETEAVSSSQEIPTMPQPIEKV 
SVSTQTKKLSASSPRMLHRSTQTTNDGVCQSMCHDKYTKIFNDF 
KDRMKSDHKRETERWREAIjEKLRSEMEEEKRQAVNKAVANMQG 
EMDRKCKQVKEKCKEEFVEEIKKLATQHKQLISQTKKKQWCYNC 
EEEAMYHCCWNTSYCSIKCQQEHWHAEHKRTCRRKR 




5969 


81 


1288 


vrfprrggapptvltpgkqqgvfixspqrpgsepdipargqphpp 

RPVGVSTSAQAQVQPPAMHRRRLALGLGFCLLACTSLSVLWVyL 

BNWLPVS YVP Y YLPCPE I fnmklh ykrekplqpwwsq ypqpkl 
lehrptqlltltpwlapivsegtfnpellqhiyqplnltigvtv 
favgn / hfiiesaeeffkrgyrvhy yi ftdnpaavpgvplfiphtjt 

L SS I P I Q GHSH WEETSMRRMEriSQH I AKRAKRE VD YL FCLD VD 

MVFRNPWGPETLGDLVAA1HPSYYAVPRQQFPYERRRVSTAFVA 

DSEGDFYYGGAVFGGQVARVYEFTRGCHMAIIADKANGir^AAWR 

EESHLNRHFISNKPSKVI^SPEYLWDDRKPQPPSLKLIRFSTLDK 
DISCLRS 


5970 


1126 


533 


D VG FN I KRKRCDLD VFIiES PRK PSGRRDRAPEKQRR I AAN KCLC 
TG VREG E P P S /TTS QK VKEAGRDFT YLI WIjFG I S I TGGLF YT I 
FXELFSSSSPSKIYGRALEKCRSHPEVIGVFGESVKGYGEVTRR 
GRRQHVRFTEYVKDGLKHTCVKFYIEGSEPGKQGTVYAQVKENP 
GS G E YDFR Y I F VE I ES Y PRRT III EDNRS QDD 




316 


4712 


I 
E 
I 
G 
P 


SQDxv iVjrtK J^LiUKHG WKLGQGLdl^ LQGRTDP I P I WK YDVMGMG — 
RMEMhJLDYAEDATERRRVLEVEKEDTEEZ^QKYKDYVDKEKAIA 
KALEDLRANF YCEL CDKQ YQKHQE FDNH INS YDHAHKQRIjKDLK 
QREFARNVSSRSRKDEKKQEKALRRLHEIAEQRKQAECAPGSGP 
MFKPTTVAVDEEGGEDDKDESATNSGTGATASCGLGSEFSTDKG 
GP FTAVQ I TNTTGLAQAPGLAS QG I S FG I KNNLGT PLQKLGVS F 

SFAK:<APVKLESIASVFKDHAEEGTSEDGTKPDEKSSE3QGLQKV 
GDSDGSSNLDG KKEDEDPQDGGSLAS ThS KLKRMKREEGAGATE 

peyyhyippahckvkpnfpfllpmraseqmdgdntthpknapes 

KKGSSPKPKSCIKAAASQGAEKTVSEVSEQPKETSMTEPSEPGS 
KAEAKKAI^GGDVSDQSLESHSQKVSETQMCESNSSKBTSIoATPA 
GKESQEGPKHPTGPFFPVLSKDESTAI.QWPSELLIFTKAEPSIS 
YSCNPLYFDFKLSRNKDARTKGTEKPKDIGSSSKDHLQGLDPGE 
PNKSKEVGGEKIVRSSGGRMDAPASGSACSGLNKQEPGGSHGSE 
TEDTGRSLPSKKERSGKSHRHKKKKKHKKSSKHKRKHKADTEEK 
SSKAESGEKSKKRKKRKRJCKMKSSAPADSERGPKPEPPGSGSPA 
PPRRRRRAQDDSQRRSLPAEEGSSGKKDEGGGGSSSQDHGGRKH 
KGELPPSSCQRRAGTKRSSRSSHRSQPSSGDEDSDDASSHRLHQ 
KSPSQYSEEEEEEDSGSEHSRSRSRSGRRHSSHRSSRRSYSSSS 

DAS S DOS PVfinnRf!VQnncVQr»VCnnnnr»t»A»™.«.._ 

^nooyyo v_ x skwkh x SODS YSDYSDRSRRHSKRSHDSDDSDYAS 
3KHRSKRHKYSSSDDDYSLSCSQSRSRSRSHTRERSRSRGRSRS 
5 SCSRS RS KRRS RS TTAHS VIQRSRS YS RDRS RS TRS P SQR S GS R 

CRSWGHESPEERHSGRRDFIRSKIYRSQSPHYFRSGRGEGPGKK 
JDGRGDDSKATGPpSQNSNIGTGRGSEGDCSPEDKNSVTAICLLli 
IKIQSRKVERKPSVSEEVQATPNKAGPKLKDPPQGYFGPKLPPS 
•GNKPVLPLIGKLPATRKPNKKCEESGL3RGBEQEQSETEBGP 11 
SSDALFGHQFP\SEETTGPIjLDPPPEESKSGBVTADHPVAPLG 
PAHFDCYLGDPT1SHNYLPDPSDGNTLESLDSSSQPGPVBSSL 
PIAPDIiEHFPSYAPPSGDPS iestdgaeda\si*aplesqpitf 



5972 



5973 






53 



440 



65 






2149 




U'PEfiMBKYSKI^QAAQQHlQQQI^KQVKAFPASAAIAPATPAi: 
Q P I H IQQ PATAS ATS I TT VQHAI LQHHAAAAAAA IG I H PH PH PQ 

PIAQVHHIPQPHLTPISLSHLTPISIIPGHPATPLASHPIHIIPA 

SAIHPGPFTFHPVPHAALYPTLIAPRPAAAAATALHLHPLLHPI 
FSGQDLQHPPSHGT 

S>y L> YFVGVDMDNP I GNWDGRFDGVQLCS FACVESTI LIjHIND II



1761



2007



5974 



4293 



2200 



PESVTQERRPPKLAFMSRGVGDKGSSSHNKPKATGSTSDPGNRN 
RS 2 L FYTLNGS S VDS QPQS KS KNTW Y I DE VAEDPAKS LTE I S TD 
FDR3 S P P LQ p PP VNS LTTENRFHSLP FSLTKM PNTNGS IGHS PL 
SLSAOSVMEEI^TAPVQESPPIiAMPPGNSHGLEVGSLAEVKENP 
PFYGVIRWIGQPPGLNEVU\GLELEDECAG\CTDGTF/REGTRY 
FTCALKKALFVKLKSCRPDSRFASLQPVSNQIERCNSLAIWEAv 
LSEWEENTPTQKWEKEGLEIMIG\KKKGIQGHYNSCYLDSTLF 
CLFAFSSVLDTVLLRPKEKNDVEYYSETQEIjLRTElVNPLRIYG 
YVCATKIMKLRKILEKVEAASGFTSEEKDPEEFTiNILFHHILRV 
BPLIjKIRSAGQKVQDCYFYQIFMEKNEKVGVPTIQQLLEWSFIN 
SNLKFAEAPSCLI IQMPRFGKDFKLFKKI FPSLELNITDLLEDT 
PRQCRICGGLAMYECRECYDDPDISAGKIKQFCKTCNTQVHLHP 
KRLNHKYNP VSLPKDLPDWDWRHGCI PCQNMEL FAVLCIETSH Y 
VAFVKYGKDDSAWLFFDSMADRDGGQNGFNIPQVTPCPEVGEYI, 
KMSLEDLKSLD5RRIQGCARRLLCDAIYVPCTQ S PTMSLYK 
ILLAGSPSPRDQCSQRQSSGGDKEbVTRGCTFSTAWSPSAMTQ 
EPFREELAYDRMPTLERGRQDPASYAPDAKPSDLQLSKRLPPCF 
SHKTWVFSVLMGSCLLVTSGFSLYLGNVFPAEMDYLRCAAGSCI 
PSAIVSFTVSRRNANVIPNFQII*FVSTFAVTTTCE.IWFGCKLVI» 
NPSAININFTTLILLLLLELIJ^AATVIIAARSSEEDCKKKKGSMS 
DSANILDEVPFPARVLKS YSWE VIAGISAVLGGI IALNVDDSV 
SGPHLSVTFFWILVACFPSAIASHVAAECPNKCLVBVLIAISSIi 
TSPLLFTASGYLSFSIMRIVEMFKDYPPAIKPSYDVLLLIiLLLV 
LLLQA/GPQHGHRHPVRAIiQGQCKAAGCILGHPERPAGAPGWGG 
GQE P P EG VRQGE SLES RRGANGP VTPRRGNRVAAPS LAPGMETH 
NP 

NGDG KDL FGH I WAWRSN G 1 1 SNFRRS P HAGMAEDE P DAKS P KTG 
GRAP PGGAE AGE PTTLLQRLRGT I S KAVQNKVEG I LQDVQKF S D 
NDKLYLYLQLPSGPTTGDKSSEPSTLSNEEYMYAYRWIRNHLEE 
HTDTCIjPKQSVYDAYRKYCESLACCRPIjSTANFGKI IRE I FPDI 
KARRLGGRGQSKYCYSGIRRKTLVSMPPLPGLDLKGSESPEMGP 
E VT PAPRDELVEAACALTC DWAER I LKRS FSS I VE VARFLLQQH 
L I SARS AHAKVLKAMGLAE EDEHAPRERS S KPKNGL EN PEGG AH 
KKPERLAQPPKDLEARTGAGPLARGBRKKSWESSAPGANNLQV 
NALVARLPLLJLPRAPRSLIPPIPVSPPIXAPRLSSGALKVATLP 
LSS RAG AP PAAVP I INM I LP TVPALPGPG PGPGRA P PGGLTQ PR 
GTENRE VG IGGDQGPHDKG VKRTAE VP VS EASGQAP PAKAAKQD 
IEDTASDAKRKRGRPLKKSGGSGERNSTPLKSAAAMESAQSSRL 
! PWETW< 3SGGBGNSAGGAERPGPMGEAEKGAVLACG\QGDGTVSK 
GGRGPGS OHTKEAE0KI PLVPS KVSVI fCGSRSQKEAFPLAKGE V 
DTAPQGNKDLKEHVLQSSLS QEHKDPKATPP 
IxGLQ^IHTTSGRIHQAMVT^L MKD^fESVTVEWlK^JGDTKGK^BID 
LESIFSLNP\DL\VPDGEIEPSP\ETPPPPASSAKVNKIVKNRR 
TV\ASIKNDPPS\RDNRWGSARARPSQFPEQFSSAQQNGSV\S 
D I S P VQAAKKE FG PPS RRKSNC VKEVE KLQE KREKRRLQQQELR 
EKRAQDVDATNPNYE I MCM IRDFRGSLDYRPLTTADP I DEHRI C 
VCVRECRPLNKKETQMBCDLDVITI PSKDWMVHEPKQKVDLTRYL 
E NQTFRFD YAFDDS APNEMV Y R F TARPLVET I FERGMATCFAYG 
QTGSGKTmMGGDFSGKNQDCSKGI YALAARD VFLMLKKPMYKK 
LELQVYATFFEIYSGKVPDIJJ^RKTKLRVLEDGKQQVQVVGLQE 
REVKCVEDVLKLIDIGNSCRTSGQTSANAHSSRSHAVFQI ILRR 
KGKLHGKFSLIDLAGNERGADTSSADRQTRLEGAEINKSLLALK 
ECIRALGRmCPHTPFRASKLTQVLRDSFIGENSRTCMIATISPG 
MASCENTLNTLRYANRVKELTVDPTAAGDVRPIMHHPPNQI \VD 
LETQWGVGSSPQRDDLKLLCEQNEEEVSPQLFTFHEAVSQMVEM 








EEQWEDHRAVFQES I RWLEDEKALLEMTEEVDYDVDS YATQLE 
AILEQKIDILTELRDKVKSFRAALQEEEQASKQINPKRPRAL 


5975 


4293 


2200 


lglqmhttsgrihqamvtslnednesvtvewiengdtkgk\eid 
le s i fslnp \ dl \ vpdgei e ps p \ et ppp passakvnk i vknrr 
tv\asikndpps\rdnrwgsararpsqfpeqfssaqqngsv\s 
d i s p vqaakkefg p ps rrksncvkeve klqe krekrr lqqqelr 
ekraqdvdatnpnye i mcmird frgsld yr plttadp i dehri c 

VCVRKRPLNKKETOMKDTiDVT T T V><ZKr>\r\7M\F-lT? c irninmr tdvt 
ENQTFR FD YA FDDSAPNEMV YR FTARPL VETI FERGMATCFA YG 
QTG5GKTHTMGGD FSG KNQD CS KG I YALAARDVFLMLKKPN Y KK 
LE LQVYATF FE I YS G KVFDIiTiN R KTKLRVLE DGKQQ VQWGLQE 
RE VKCVEDVLKLID IGNSCRTSGQTSANAHS SRSHAVFQ I ILRR 
KGKLHGKFSL,IDLAGNERGADTSSADRQTRLEGAElNKSIitiAIiK 
ECIRALGRNKPHTPFRASKLTQVLRDSFIGENSRTCMIATISPG 
MAS CENTLNTLRYANRVKE LT VDPTAAGDVRP I MHHP PNQ I \ DD 
LETQWGVGSS PQRDDLKLLCEQNEEEVS PQLFTFHBAVSQMVEM 
EEQWEDHRAVFQBSIRWLEDEKALLEMTEEVDYDVDSYATQLE 
AILEQKIDILTELRDKVKSFRAALQEEEQASKQ1NPKRPRAL 


5976 


20 


2949 


VHHLHLTRVSVWNLDI I LR I AQQMGI KTLNL VI/3 \LKRA\LEF 
PEVS WMEVKD PNMKGAMLTNTGKYAI PTI DA\EAYAIGKKEKPP 
FLPEEPSSSSEEDDPIPDELLCLICKOIMTDAWIPCCGNSYCD 
BC IRTALLESDEHTCPTCHQNDVS PDALIA^KFLRQAVNNFKNE 
TG YTKRLRKQLPS PPP P I PPPRPLIQRNLQPLMRSP I SRQQDPL 
MIPVTSSSTHPAPSISSLTSNQSSLAPPVSGNPSSAPAPVPDIT 
ATVS IS VHSE KS DGPFRDS DNKI LPAAALASEHS KGTSS I AITA 
LNEEKGYQVPVLGTPSLLGQSLIiHGQIilPTTGPVRINTARPGGG 
RPGWEHSNKLGYLVSPPQQIRRGERSCYRSINRGRHHSERSQRT 
CGPSIjPATPVFVPVPPPPLYPPPPHTLPLPPGVPPPQFSPQFPP 
GQP\PPAGYSV?PPGFPPAPANLSTPWVSSGVQTAHSNTIPTTQ 
APP LSREE F YREQRRL KEEEXKKS KLDEFTND FAKELME YKKI Q 
iuiniuwroAaALiriauoo loKoo J I ioKSKSGSTKSRSYSRSFS 
RSHSRS YSRSPP YPRRGRGKSRNYRSRSRSHGYHRSRSRS PP YR 
RYHSRSRSPQAFRGQSPNKRNVPQGETEREYFNRYREVPPPYDM 
KAYYGRSVDFRDPFEKERYREWERKYREWYEKYYKGYAAGAQPR 
PSANRENFS PERFLPLNIRNS PFTRGRREDYVGGQSHRSRNIGS 
NYPEKLSARIXJHNQKDNTKSKEKESEWAPGIX3KGNKHKKHRKKR 
KGEESEGFLNPELLETSRKSREPTGVEENKTDSLFVLPSRDDAT 
PVRDEPMDAES I T FKS VSE KDKRERDKP KAKGDKTKRKNDGS AV 
S KKEN I VKPAKG PQEKVDG \ DVRDIiLDLNlj\QIiKKP KEETPKDIj 
TILNHHLj PLRRMKKSL \ E P P \ EKLTLNQQK\TPRNKTSQRG KS E 
EGLFQRCQIRKANN 


5977 


1363 


1336 


FLEDRGQVLSHFQCLSLHSINHILHPGAGVAAGPAtGW/REYLT 
PVLKESKFKETGVITPEEFVAAGDHLVHHCPTWQWATGEELKVK 
AYLPTGXQFLVTKNVPCYKRCKQME YSDE LEAI IEEDDGDGGWV 
DTYHNTG I TG I TEAVKE ITLENKDNIRLQDCSALCEEEEDEDEG 
EAADMEEYEESGIiLETDEATtiDTRKIVEACKAKTDAGGBnAIliQ 
TRTYDLYITYDKYYQTPRLWLFGYDEQRQPLTVEHMYEDISQDH 
VKKTVTIENHPHIjPPPPMCSVHPCRHAEVMKKIIETVAEGGGEL 
GVHMYLLI FLKFVQAVIPTI EYDYTRHFTM 


5978 


160


3213 


rdgarrwggcqspltwapgfyrrfdlatsgrrlrgqtaepagrq 

RPRREPEAMDEQSVESIABVFRCFICMEKIiRDARLCPHCSKLCC 
FS CIRRWLTEQRAQC PHCRAPLQLR EIiVNCRWAEE VTQ QLDTLQ 
LCSLTKHEENEKDKCENHHEKLSVFCWTCKKCICHQCALWGGMH 
GGHTFKPLAE I YE QH VTKVNEE VAKLRRRLME LIS LVQE VE RN V 
E AVRNAKDE RVRE I RNAVEMM I ARLDTQLKN KL I T LMGQKTS LT 
QETELLESLLQEVEHQLRSCSKSELISKSSEILMMFQQVHRKPM 
AS FVTTPVP PDFTSELVPS YDSATFVLENFSTLRQRADPVYSPP 
LQVSGLCWRLKVYPDGNGWRGYYLSVFLELSAGLPETSKYEYR 
VEMVHQS CNDPTKNI IRBFASDFEVGECWGYNRFFRLDLLANBG 
YLNPQNDTVIIiRFQVRSPTFFQKSRDQHWYITQLEAAQTSYIQQ 
I JSfNLKERLT I E LSRTQKSRDLS P PDNHLS PQND DALETRAKKS A 








CSDMLLER \GP YSAS \VREAKEDEEDEEKIQNEDYHKELSDGDL 
DLDLVYEDEVNQLDGSSSSASSTATSNTBENDIDEETMSGENDV 
EYNNMBLEEGELMEDAAAAGPAGSSHGYVGSSSRISRRTHLCSA 
ATSSLLDIDPLILIHLLDLKDRSSIENLWGLQPRPPASliLQPTA 
SYSRKI5KDQRKQQPJ^WRVPSDLKMIjKRLKTQMAEVRCMKTDVKN 
TLSEIICSS5AASGDMQTSLFSADQAALAACGTENSGRLQDLGME 
LLAKSSVANCYIRNSTNKKSNSPKPARSSVAGSLSLRRAVDPGE 
NSRSKGDCQTLSEGSPGSSQSGSRHSSPRALIHGSIGDILPKTE 
DRQCKALDSDAWVAVFSGLPAVEKRRKMVTLGANAKGGHLEGL 
QMTDLENNSETGELQPVLPEGASAAPEEGMSSDSDIECPTENEE 
C E EHTS VGG FHDS FMVMTQ PPDEDTHS S FPDGEO I GPEDLS FNT 
DENSGR 


5979 


212 


3655 


LPDMTM YLWLKLLAFGFAFLDTEVFVTGQS PTPS PTDAYLNASE 

TTTLSPSGSAVISTTTIATTPSKPTCDEKYANITVDYLYNKETK 

LPTAKLm^ENVECGNNTCTNNEVHNLTECKNASVSISHNSCTA 

PDKTLILDVPPGVEKVPVHCCSXQVEQPDSTIWLKWKNIETSTC 

DTQN I T YRFQCGNMI FDNKE I KLENLE PEHE YKCDS EI L YNS HK 

FTNAS KI IKTDFGSPGEPQI I FCRSEAAHQGVITWNPPQRS FHN 

FTLCYI KETEKDCLNLDKNLIKYDLQNLKPYTKYVLSLHAYITA 

KVQRNGSAAMCHFTTKSAPPSQVWNMTVSMTSDNSMHVKCRPPR 

DRNGPHERYHLEVEAGNTLVRNESHKNCDFRVKDLQYSTDYTFK 

AYFHNGDYPGEP FILHHSTS YNSKALI AFLAFLI I VTS IALLW 

LYKIYDLHKKRSCNIjDEQQELVERDDEKQLMNVEPIHADILLET 

YKRKI ADEGRLFLAE FQS I PR VFS KFP I KEARKP FNQNKNR YVD 

ILPYDYNRVELSEINGDAGSNYINASYIDGFKEPRKYXAAQGPR 

DETVDDFMRM1 WEQKATVI VMVTRCEEGNRNKCAEYWPSMEEGT 

RAFGECCCKDLTIOiKKCPXDYIIQKI^IVNKKEKATGREVTHIQ 

FTSWPDHGVPEDPHLLLKLRRRVNAFSNFFSGPIVVHCSAGVGR 

TGTYIG I DAM LEGL EAENKVDVYG YVVKLRRQRCLMVQVE AQ Y I 

LIHQALVE YNQ FGETEVNLS E LHP YLHNMKXRDP PS EPS P LE AE 

FQRLP S YRSWRTQHI GNQE \ ENXS KNRNSNVI P YDYNR VPLKHE 

LEMSKESEHDSDESSDDDSDSEEPSKYINASFIMSYWKP\EVMI 

AAQGPLKETIGDFWQMIFQRKVKVIVMLTELKHGDQEICAQYWG 

EGKQTYGDIEVDLKDTDKSSTYTLRVFELRHSKRKDSRrVYQ^Q 

YTNWSVEQLPAEPKELISMIOVVKOKLPQKNSSEGNKHHKSTPL 

L I HCRDGSQQTGI FCALLNLLES AETE E WDI FQ WKALRKAR P 

GMVSTFEQYQFLYD VI ASTYPAQNGQVKKNNHQEDKI E FDNEVD 

KVKQDANCVNPLGAPEKLP EAKEQAEG S E PTSGTEG PEHSVNG P 

ASPALNQGS 


5980 
5981 


3 


2363 

J 
] 
1 


DAWGCKLRRLRFTYGTQTRVSLALPGQYELVHTLVAHQGNWETI 
PEEDLE VQENNE DAAHDLTELE VTMHHAIjLQ E VDWVAPCQGLR 
PTVD VLGDL VND FLP VITYALHXDELS ERDEQELQE I R KYF S FP 
VFFFKVPKLGSEIIDSSTRRMESERSPLYRQLIDLGYLSSSHWW 
CGAPGQDTXAQS MLVEC2SEKLRHLSTFSKQVLQTRLVDAAKAIjN 
LVHCHCLD I FIKQAFDMQRDLQ ITP KRLE YTRKKENEL YESLMN 
IANRKQEEMKDMIVETLNTMKEELLDDATNMEFKDVIVPENGEP 
VGTRE I KCCI RQ IQELI ISRLNQAVANKL I S S VD YLRES FVGTL 
ERCLQSLEKSQDVSVHITSNYLKQILNAAYHVEVTFHSGSSVTR 
ML WEQ I KQI I QR I TW VS PPAI TLE WKRKVAQEAI ES LS AS KLAK 

SICSQFRTRLNSSHEAFAASLRQLEAGHSGRLEKTEDLWLRVRK 
DHAPRLARLSLESRSLQDVLLHRKPKLGQELGRGQYGWYLCDN 

WGGHFP CALKS WPDTYPITCJtfJMTYr A T DcniVMn^r r\rrrTnn> .- 

r» e umi rvo V V X t'Ut^lSMWNUliAXih.FH YMRSLPKHERLVDLKG 

SVIDYNYGGGSSIAVLLIMERLHRDLYTGLKAGLTLBTRLQIAL 
DWEGIRFLHSQGLVHRDIKLKNVLLDKQNRAKITDLGFCKPEA 
^MSGSIVGTPIHMAPELFTGKYDNSVDVYAFGILFWYICSGSVK 
LPEAFERCASKDHLVJNNVRRGARPERLPVFDEECWQLMEACWDG 
DPLKRPIiLGIVQPMLQGIMNRLCKS\NSEQPNRGLDDST 




1 


2519
I 

C 

3 


3RKHSAAMERPWGAADGLSRWPHGLGLLLLLQLLPPSTLSQDRL 
DAPPPPAAPLPRWSGP IGVS WGLRAAAA\GGAFPRGGRWRRSAP 
3\EDEECGRVRDFVAKLANNTHQHVFDDLRGSVSLSWVGDSTGV 
[LVLTTFHVPLVIMTFGQSKLYRSEDYGKNFKDITDLINNTFIR 






TEFGMAIGPEKSGKVVLTAEVSGGSRGGRIFRSSDFAKNFVQTD 1 
LP FHPLTQMM YS PQNSDYLLALS TENGXiWVSKNFGGKWEE IHKA 
VCIiAK WGS DNTI FFTT YANGSCKADLGALELWRTSDLGKSFKTI j 
GVKIYSFGIiGGRFLFASVMADKDTTRRIHVSTDQGDTWSMAQLP 
SVGQEQF YS ILAANDDMVFMHVDEPGDTGFGTI FTSDDRG I VYS 
KSLDRHLYTTTGGETDFTNVTSLJIGVYITSVLSEDNS IQTMITF 1 
DQGGRWTHLRKPENSECDATAKNKNECSLHIHASYS I SQKLNVP 
MAPLSEPNAVGIVIAHGSVGDAISVMVPDVYISDDGGYSWTKML 
EG PHY YT I LDSGG 1 1 VAI EHS S R P I NVI XFSTDEGQCWQTYTFT 
RDPIYFTGLASEPGARSMNISIWGFTESFLTSQWVSYTIDFKDI 
LERNCEEKDYriWLAHSTDPEDYEDGCILGYKEQFLRLRKSSVC 
QNGRDYWTKQPSICLCSLEDFLCDFGYYRPSNDSKCVEQPELK 
GHDLEFCJLYGREEHLTTNGYRKIPGDKCQGGVNPVREVKDLKKK 
CTSNFLSPEKQNSKSNSVPIILAIVGLMLVTWAGVLIVKKYVC 
GGRFLVHt> YS VLQQH \AEA\NG VDGVDALDTASHTNKSG YHDDS 
DEDLLE


5982 
5983 


56 


2316

ATR ? PRGS S WCRQFSRTASAA PGRSNMLiR I PVRKAL VGLS KS P K I
GC VRTTATAASNL I E VFVDGQS VMVEPGTT VIjQACE KVGMQ I PR 
FC YHERIiS VAGNCRMCLVE IEKAP K WAACAMPVM KG WN I LTNS 
EKSKKAREGVMEFLLANHPLDCPICDQGGECDLQDQSMMFGNDR 
SRFLEGKRAVEDKNIGPIiVKTIMTRCIQCTRCIRFASEIAGVDD 
LG TTGRGNDMQVGT Y IE KMFMS E L SGN 1 1 D I CPVGALTS KP YAF 
TARPWETRKTESIDVMDAVGSNIWSTRTGEVMRILPRMHEDIN 
EE WISDKTRFAYDGLKRQRLTEPMVRttE KGIJ/f yt<5 wpna t . cdv 

AGMUSSFQGKDVAAlAGGIiVDAEALVALKDLLNRVDSDTLCTEE 
VFPTAGAG TDLRSN YL LNTTIAG VEEAD WLL VGTN"PRFEA PbF 
NAR I RKS WIiHNDLKVAL IGS PVDLT YT YDHLGDS PKI LQD I ASG 
SHPFSQVLKEAXKPMWLGSSALQRNDGAAILAAVSSIAQKIRM 
TSGVTGDWKVMKILHRIASQVAALDLGYKPGVEAIRKNPPKVLF 
LLGADGGCITRQDLPKDCFIIYQGHHGDVGAPIADVILPGAAYT 
EKSATYVNTEGRAQQTKVAVTPPGLAREDWKI I RAIjSEIAGMTIj 
PYDTL\DQVRNRIiEEVSPNLVRYDDIEG\ANYFQQANEIiSKIjVN 
QQLLADPLVPPQLIWKDF YMTDS I SRASQTMAKCVKAVTEGAQA 
VEEPSIC




248 


1763 


KAKUDGGRRRHRASGRRAG RG E P \ AGL KSQGQRA\/ PKRAVARGG 
RQ \ YS AAI ALLE PAGS E I ADDLS I L YSNRAACYLKEGNCS G C I Q 
DCNRALELH PFSMKPLLR RAMAYE TLEQ YGKAY VD YKT VLQ ID C 
GLQIJ^DSVNRLSRIIJ^ELIKIP^REKIaSLIPAVPASVPLQAWH 
PAKEMISKQAGDSSSHRQO^ITDEKTFKALKEEGNQCVNDKNYK 
DA LS K YS ECL KINNKECAI YTMRALC YLKLCQFE EAKQJDCDQAL 
QI^GNVKAFYRRALAHKGIiKNYQKSIilDLNKVILIiDPSIIEAK 
MELEEVTRIiLNLKDKTAPFNKSKERRKIEIQEVNEGKBEPGRPA 
GEVSTGCLASEKGGKSSRSPEDPEKLPIAKPNNAYEFGQIINAL 
STRKDKEACAHLLAITAPKDLPMFLSNKLEGDTFLLLIQSLKNN 

LIEKDPSLVYQHLLYLSKAERFKMMLTLISKGQKELIEQLFEDL 
SDTPNNHFTLEDIQALKRQYEL


5984
5985 


7SS 


1193 


SS VCMACTYVSNLGKKQRS V3 FLASGLMRVSTGPEltRLHHSFVli "j 

TGDVGRRICRLLVGLFTKGDTSS KRVHPFSPGPCFLL>CDLARVG 

SS PKINVSPFYQN\QTSTQRSCTVFVWQRCSIiVGPFQVTVFTMY 
FHHSLRS I S RFS SG 




22


1408 


RR VARPGTAEP AKARRTVRRGRARRDLAG AE RKAGVS E RODS GR 1 
RRPNPS I PSAAAGMSHIQI PPGiTELLQGYTVEVLRQQPPDLVE 
FAVEYFTRLREARAPASVLPAATPRQSLGHPPPEPGPDRVADAK 
GDSESEEDEDLEVPVPSRFNRRVSVCAETYNPDEEEEDTDPRVI 
HP KTDEQRCRLQEACKDILLFKMLDQEQI*S Q VGQAM FER I VKAD 
EHVI DQGPDGDNF YVI B RGT YD I LVTKDNQTRS VGQ YDNRGS FG 
E LAI*M YNTP RAAT I VATSEGS L WG LDRVTFRRI I VKNNAKKRKM 
FES F I ES VP LLKS LE VS ERMK I VD VIGB K I YKR/DGER 1 1 TQG E 
K\ADSFYI I2SGEVSILIRSRTKSNKDGGNQEVEIARCHKGQYF 
GEIALVTNKPRAASAYAVGDVKCLVMDVQAFERLLGPCMDIMKR 
NISHYEEQLVKMFGSSVDLGNLGQ 


5986 


1806 


484 


DAWKSTSLTFHWKLWGRHRGRRRGLAHPKNHLSPQQGGATPQVP 
o utcr u& fKijFi' F PRIiG JjLGALMAEDG VRG S P P V PSG P PMEBD 
GLRWTPKSPLDPDSGLLSCTLPNGFGGQSGPEGERSLAPPDAS I 
LISNVCS IGDHVAQEI,FCX3SDLGMABEAERPGEK\AGQHSPLRE 
EHVTCV0S I IjDEFLQT\ YGS L I PLS TDEWE KLED I FQQEPST P 
SRKGLVLQLIQSYQRMPGNAMVRGFRVAYKRHVLTMDDIjGTLYG 
QNWLNDQVMNMYGDLVMDTVPEK\VHFFNSFFY\DKI,RrKGYDG 
VKRWTKNVD I FNKBLLLI PIHLEVHWSLI SVDVRRRTtTYFDSQ 
RTLNRRCPKHlAKYLQAEAVKKDRIiDFHO^WKGYFKMWARQNN 

DSDCGAFVLQYCKHLAliSQPFSFTQQDMPKLRRQlYKELCHCKIi 
TV


5987 


1806 


484


DAWK^TSJjTFHWKLWGRHRGRRRGIJU^PKNHLSPQO^GATPQ'VP 
S PCCRFDSPRGPPPPRLGLLGALMAEDGVRGSPPVPSGPPMEED 
GLRWTPKS PLDPDSGLLS CTLPNG FGGQS G P EG ER SLAP PDAS I 
L I SNVCS IGDHV AQELFQGS DLGMAEEAER PGE K \AGQHS PLRE 
EHVTC VQS I LDE FLQT \ YGS L I P LSTDEWE KLED I FQQE FSTP 
SRKGLVLQLI QS YQRM PGNAMVRGFR VA Y KRHVLTMDDLGTLYG 
QNVnJNDQVMNMYGDLVMDTVPEK\VHFFNSFFY\DKLRTKGYDG 
VKRWTKNVDI FNKELLL I P IHLE VH W S LI S VD VRRRTI T YFDSQ 
RTLNRRCPKH I AKYLQAEAVKKDRLD FHQG WKG Y FKMNVARQNN 

DSDCGAFVLQYCKHLALSQPFSFTQQDMPKLRRQIYKBLCHCKL 
TV


5988 


1292 


410 


fkkyflsflgllesshsrdrihnlvlmfllathnlvwwftcrfq 
: rldciylnagimpnpqlnikallfglfsXaeglltqgdkitadg 

LQEVFETDVFGHFILIRELEPLLCHSDNPSQLIWTSSRNARKSN 
FSLEDFQHS KGKEPYSS S KYATDLLS VALNRNFNQQGLYSNVAC 
PGTALTNLTYGILPPFIWTLLMPAILLLRFFANAFTLTPYNGTE 
AL VWLFHQKPES LN PL I KYLS ATTGFGRNY IMTQKMDLDE DTAE 
KFYQKLLELEKH I R VTIQKTDNQARLS GS CL 


5989


194 


2610


AMDFPQHSQHVLEQLNQQRQLGLLCDCTFWDGVHFKAHKAVLA 

ACSEYFKMLFVDQKDWHLDISNAAGLGQVLEFMYTAKLSLSPE 

NVDDVL\ AVATFLQMQDI ITACHALKSLAEPATSPGGNAEALAT 

EGGEKRAKEEKVATSTLSRLEQAGRSTPIGPSRDLKEERGGQAQ 

S AASG AEQTEKADAPREPP P VEL KPDPTS GMAAAEAEAALSES S 

EQEMBVEPARKGEEEQKEQEE02EEGAGPAEVKEEGSQLENGEA 

PEENENEESAGTDSGQELGSEARGLRSGTYGDRTESKAYGSVIH 

KCED CG KEFTHTGNFKRH I R I HTGEKP FS CRECS KAFS DPAACK 

AHEKTHS PLKPYG CE E CGKS YR L I SLLNLRKKRHSG EARYRCED 

CGKLFTTSGNLKRHQLVHSGEKPYQCDYCGRSFSDPTSKMRHLE 

THDTDKEHKCPHCDKKFNQVGNLKAHLKIHIADGPLKCRECGKQ 

FTTSGNLKRHLRIHSGEKPYVCIHCQRQFADPGALQRHVRIHTG 

EKPCQCVMCGKAFTQASSLIAHVRQHTGEKPYVCERCGKRFVQS 

SQIiANHIRHHDNIRPHKCSVCSKAFVNVGDLSKHI I IHTGEKPY 

LCDKCGRGFNRVDNLRSHVKTVHQGKAGIKILEPEEGSEVSWT 

VDDMVTLATEALAATAVTQLTVVPVGAAVTADETEVLICAEISKA 

VKQ VQEE DPNTHILYACDS CGDKFLDANS LAQHVR IHTAQ ALVM 

FQTDADFYQQYGPGGTWPAGQVLQAGELVFRPRDGAEGQPALAE 
TSPTAPECPPPAE 


5990 


2 


4700 

1 i 

< 
1 
I 

[ I 


FGPGPDSGGGARGSGWGSRSQAPYGTLGAVSGGEQVLLHEEAGD 
SGFVSLSRLGPSLRDKDLEMBELMLQDETLLGTMQSYMDASLIS 
LIEDFGSLGEVEMSLPDPSWDFSPPSFLETSSPKLPSWRPPRSR 
PRWGQSPPPQQRSDGEEEEEVASFSGQILAGELDNCVSSIPDFP 
MHLACPEEEDKATAAEMAVPAAGDES ISSLSELVRAMHpycLPN 
LTHLAS LEDELQEQPDDLTLPEGCWLEI VGQAATAGDDLEI PV 
WRQVSPGPR PVLLDDSLETSSALQLLMPTLESETEAAVPKVTL 
CSEKEGLSLNSEEKLDSACLLKPREWEPWPKEPQNPPANAAP 
3S QRARKGR KKKS KEQPAACVEG YARRLRS S SRGQSTVGTE VTS 
2VDNLQKQPQEELQKESGPLQGKGKPRAWARAWAAALENSSPKN 
jERSAGQSSPAKEGPLDLYPKLADTIQTNPIPTHLSLVDSAQAS 
^MPVDSVEADPTAVGPVLAGPVPVDPGLVDLASTSSELVEPLPA 
CPVLINPVIiADSAAVDPAVVPISDNLPPVDAVPSGPAPVDLALV 








DPVPNDLTPVDPVLVKSRPTDPRRGAVSSALGGSAPQIjLVESES

LDPPXTIIPEVKEVVDSLKIESGTSATTHEARPRPLSLSEYRRR 

RQQRQAETEEKSPQPPTGKWPSLPETPTGIADI PCLVI PPAPAK 

KTALQRS PETPLE I CLVP VGPS PASPS PE PPVS KPVASS PTEQV 

PSQEMPLLARPS PPVQS VS PAVPTPP SMSAALPFPAGGLGMPPS 

LPPPPLQPPSLPLSMGPVLPDPFTHYAPLPSWPCYPHVSPSGYP 

CLPPPPTVPLVSGTPGAYAVFPTCSVPWAPPPAPVSPYSSTCTY 

GPLGWGPGPQHAPFWSTVPPPPLPPASIGRAVPQPKMESRGTPA 

GPPENVLP LS MAPPLSLGLPGHGAPQTEPTKVEVKP VPAS PHPK 

KKVSALVQSPQMKAIACVSAEGVTVEEPASERLKPETQETRPRB 

KPPLPATKAVPTPRQSTVPKLPAVHPARLRKLSFLPTPRTQGSE 

DWQAFISEIGIEASDLSSLLEQPEKSEAKKECPPPAPADSLAV 

GNSGGVDIPQEKRPLDRLQAPELANVAGLTPPATPPHQLWKPLA 

AVS LIiAKAKS PKS TAQEG TLKP EG VT3AKH PAAVRLQEGVHGPS 

RVHVG33DHDYC \VRSRTP PKK\MPAIiLI PEVGS RWNVKRHQD I 

TI KPVLSLGPAAPPPPC I AASREPLDHRTS SEOADP SAP CLAP S 

S LLS PEAS P CRNDMNTRT P P EPS AKQRSMRC YR KACRSAS PS SQ 

GWQGRl-tGRNSRSVSSGSNRTSEASSSSSSSSSSSRSRSRSLSPP 

HKRWRRSSCSSSGRSRRCSSSSSSSSSSSSSSSSSSSSRSRSRS 

PSPRRRSDRRRRYSSYRSHDHYQRQRVLQKERAIEERRWFIGK 

IPGRI-4TRSELKQRFSVFGEIEECTIHFRVQGDNYGFVTYRYAEE 

AFAAI ESGH KLR Q ADEQ P FDIiCFGGRRQFC KRS YS DLDS NREDF 

D PAPVKS KFDS LD FDTLLKQAQKNIiRR 


5991 


334 


1379 


RLSSHFSQCS PS I YC\TKFDKQGNVTS FERKKTELYQELGIiQAR 
DLRFQHVMS I TVRNNRI IMRMEYLKAVITPECLLILDYRNLNLK 
QWLFR3LPSQLSGEGQLVTYPLPFEFRAIEALLQYWINTLQGKL 
SILQPLILETLDALGDPKHSSVDRSKI*HILLQNGKSLSELETDI 
. KI FKESIUBILDEEELLEELCVSKWSDPQVFEKSSAGIDHAEEM 
ELLLEN YYRIiADDLSNAARELRVL IDDSQS 1 1 FINIiDSHRNVMM 
RIiNLQIiTMGTFSljSLFGLMGVAFGMNIiESSLEEDHRIFWLITGI 
MFMGSGIjIWRRLIjSFLGR/IiARSSIASYGMKDMVHGGIVEGIi 


5992 


2 


609 


AGPDFRLVCGVSGSGFPGGRQGQATEWRPLRPWNGAMEKIiRRVL
SGQDDEEQGLTAQDSQINIi/SEVLDASSLSFNTRLKWFAICFVC 
GVFF S I LG TG L LW LPGG I KL FAVF YTLGNLAALAS TC FLMG P VK 
QLKKMFEATRLLATIVMLLCFIFTLCAALWWHKKGLAVLFCILQ 
FLS MTWYSLS YI PYARDAVI KCCS SLLS 


5993 


1650 


594 


AEGLGSWAVWAGLGWAGRHMEAGGATGAIiGVGCKLPSAFCFPGS 
SVAMDMFQKVEICIGEGTYGWYKAKNRETGQLVALKKIRIjDLEM 
EG VP S TAI RE I SLLKE LKH PNI VRLLD WHNERKL YLVFE FLSQ 
DliKKYT^DSTPGSELPIxHLIKSYLFQLLQGVSFCHSHRVIHRDIiK 
PQNLLINELGAI KLADFGLARAFGVPLRTYTHE WTLWYRAPEI 
LLATRFYTTAVDI WS IGC I FAEMVTRKALFPGDS \ E IDQ \ LFRI 
FRMLGT PS EDTWPG VTQLPD YKGS FP KWTRKGLEE I VPNLE P EG 

RDLLMQLLQYDPSQRITAKTALAHPYFSSPEPSPAARQYVLQRF 
RH 


5994


394 


1934 


AGE VQLH VWl RGMRI QPQ/ XAAA I IDhDPDFBPQSRPRS CTWPI, 
PRPEIANQPSKPPEVEPDLGEKVHTEGRSEPILLPSRLPEPAGG 
PQPGI LGAVTG PRKGGSRRNAWGNQS YAELISQAI ES APEKRI/T 
LAQ I YE WMVRT VP Y FKDKGDSNSS AG WKNS IRHNLSLHS KF I KV 
HNEATGKSSWWMLNPEGGKSGKAPRRRAASMDSSSKLLRGRSKA 
PKKKPSGLPAP PEGATPTS P VGHFA KWSGS PCSRNR EEADMWTT 

FRPRSSSNASS VSTRLS PI>RPESEVLAEEI PAS VSS YAGGVPPT 
l^EGLELLDGLNLTSSHSLLSRSGIiSGFSLQHPGVTGPLHTYSS 
SLFS PAEG PLSAGEGCFSS SQALEALLTS DTP P P PADVLMTQVD 
PILSQAPTLLLLGGLPSSSK1ATGVGLCPKPLEAPGPSSLVPTL 
SMIAPPPVMASAPIPKALGTPVLTPPTEAASQDRMPQDLDLDMY 
MENLECDMDNI I S D LMDEGEGI*DFNFEPD P 


5995 


2 


2437 


RPPGPGPASGAV7LCTRARGSAAFVPPLPRPPSRGARRRRRLPGR
SVAALRRGPGSAPGIjPRGRAERSAAGSGRGPSREERGAAAAAAA 
kEMMEELHS L \ DP \ RRQELLEARF \TGLG VSKG PLNS ES SNQS L 
^SVGSLSDKEVETPEKKQNDQRNRKRKAEPYETSQGKGTPRGHK 
ISDYFERRVEQPLYGLDGSAAKEATEEQSALPTLMSVMLAKPRL 
DTEQLAQRGAGLCFTFVSAQQNSPSSTGSGNTEHSCSSQKQISI 

qhrqtNqsdltiekisalensknsdlekkegriddllrancdlr 
rqi \deqqkmlekyk\ erlnrcfdneprnfli eks kqekmacrd 
ksmqdrlrlghfttvrhgasfteqwtdgyafqnlikqqerinsq 
reeierqrkmlakrkppamgqappatneqkqrksktngaenbtl 

TLAE YHEQEE I FKLRLGHLKKEEAE IQAELBRLERVRNLHIREL 
KRIHNEDNSQFKDHPTLNDRYLLLHLLGRGGFSEVYKAFDLTEQ 
RY VAVK IHQLNKNWRDE KKENYHKHACRE YR I H KELDHPRI VKL 
YDYFSLDTDSFCTVLEYCEGNDLDFYLKQHKLMSEKEARSIIMQ 
IVNALKYLNEIKPPIIHYDLKPGNILLVNGTACGEIKITDFGLS 
KIMDDDSYNSVDGMELTSQGAGTYWYLPPECFWGKEPPXISNK 
VDVWSVGVIFYQCLYGRKPFGHNQSQQDILQENTILKATEVQFP 
PKPWTPEAKAFIRRCLAYRKBDRIDVQQLACDPYLLPHIRKSV 
STS S PAGAA I AS TSGAS NNS SSN 


5996


1612 


981 


DQQACLLGLMLTLEFGILEFDPSWIGSWTQR/SWVSWRSRPGCE 
LFS IWFGS IVNEGYLNSASEGEEFCIYNRNPNACS YGVAVGVL 
AFLTCLLYLAIiDVYFPQISSVKDRKK\AVLSGHPWSGEPHPAA 
FWAFLWFTGDS CYL\ANQWQVS KPKDNPLNEGTDAS PGRPSPFS 
FFSIFTWSLTAALAVRRFKDLSFQEEYSTLFP\ASAQP 


5997 


1612 


981 


DQQACIXGLMLTLEFGILEFDPSWIGSWTQR/SWVSWRSRPGCE 
LFS I WFGS I VNEGYLNSASEGEEFC I YNRNPNACS YGVAVGVL 
AFLTCLLYLALDVYFPQISSVKDRKK\AVLSGHPWSGEPHPAA 
FWAFLWFTGDS CYL\ANQWQVS KPKDNPLNEGTDAS PGRPSPFS 
FFS I FTWSLTAALAVRRFKDLS FQEEYSTLFP\ASAQP 


5998


1612 


961 


DQQACLLGLMLTLEFGILEFDPSWIGSWTUR/'SWVSWRSRPGC^ ' 
LFS I WFGS I VNEG YLNS AS EGEEFCI YNRNPNACS YGVAVGVL 
AFLTCLLYLALDVYFPQISSVKDRKK\AVLSGHPWSGEPHPAA 
FWAFLWF^TGDSCYL\ANQWQVS KPKDNPLNEGTDAS PGRPSPFS 
FFS I FTWSLTAALAVRRFKDLS FQEEYSTLFP \ ASAQP 


5999

6000


2 


1790 


RPPMEKARRGGDGVPRGPVLHIVVVGFHHKKGCQVEFSYPPLIP 
GDGHDSHTLPEEWKYLPFLALPDGAHNYQEDTVFFHLPPRNGNG 
ATVFGI SCYR\ Q IEAKALKVRQAD ITRETVQKS VCVLS KLPLYG 
LLQAKLQL 1THAYFEEKDFSQIS ILKELYEHMNSSLGGASLEGS 
QVYLGLSPRDLVLHFRHKGLILFKLILLEKKVLFYXSPVNKLVG 
ALMTVLSLFPGMIEHGLSDCSQYRPRKSMSEDGGLQESNPCADD 
FVSASTADVSHTNLGTIRKVMAGNHGEDAAMKTEEPLFQVEDSS 
KGQEPNDTNQYLKPPSRPSPDSSESDWETLDPSVLEDPNLKERE 
QLG SDQTNLFP KDS VPS E S LP I TVQPQANTGQ WLI PGL I SG LE 
EDQ YGM PLAI FT KG YLCLP YMALOQHHLLSD VTVRG FVAGATNI 
LFRCX)KHLSDAIVEVEEAI>IQIHDPELRKLLNPTTADLRFADYL 
VRH VTE'NKDD V FLDGTG WEGGDEW IRAQFAVY I HALLAATLQL V 

LFR1VNVAKKIGNVMVTT\SRNWQTGK\AVGQSVGGAFS\SAK 
TA\MS S WLSTFTTS TSQSLTEPPDEKP 


6001


101 


1561 


TEPCRTAENCTATMSENNKNSLESSLRQLKCHFTWNLMEGENSL 
DDFEDKVFYRTEFQNREFKATMCNLLAYLKRt/KGQNEAALECLR 
KAEBL I QQEHADQAS I RSLVTWGNYAW VY YHMGRLS DVQ I YVDK 
VKHVCEKFSSPYRISSPELDCEEGWTRLKCGGNQNERAKVCFEK 
ALEKKPKNPEFTSGLAIASYRLDNWPPSQNAIDPLRQAIRLNPD 
NQYLKVLLALKLHKMREEGEEEGEGEK\LVEEALEKAPG\VTDV 
uKt>/\A \ Kit y K^KDEPDKAI ELLKKALE YIP \NNAYLHCQIGCCY 
RAKVFQVMNLRBNGMYGKRKLLELIGHAVAHLKKADEANDNLFR 
VCS 1 LASLHALADQYEDAE YYFQKEFS KELTPVAKQLLHLR YGN 
FQLYQMKCEDKA1HHFIEGVKINQKSREKEKMKDKLQKIAKMRL 

SKNGADSEALHVLAFLQELNEKMQQADEDSERGLESGSLIPSAS 
SWNGE 




176 


1038 



^AHSPSRGHKHTHIHTPRHTPRCTMAESHLQSSLITASQFFEI 
^LHFDADGSGYLEGKELQNLIQELQQARKKAGLBLS PEMKTFVD 
2YGQPJDDGK1GIVELAHVLPTEENFLLLFRCQQLKSCE\EFMKT 
^RKYDTDHSGF I ETB ELKNFLKDLLEKANKTVPDTKLAE YTDLM 
LKLFDSNWDGKLELTEMARLLPVQENPIiLKPQGIKMCGKEPNKA 
PELYDQDGNGYIDENELDALLKDLCEKNKQDLDINNITTYKKNI 
MALSDGGKL YRTDLAL I LCAGDN 


6002


977


81 


LAPPGGGLHIPPRTPLSHSRPPPSHHAPHPSPLPLPPADLHPHS 
SMAQRSDLLEU3CQLTRDRWWSHDENLCRQSGLNRDVGSLDP 
EDLPLYKEKLEVYFSPGHFAHGSDRRMVRLEDLFQRPPRTPMSV 
EIKGKNEELIREQ/VX,VRRYDRWEITIWASEKSSVMKKCKAANP 
EMPLSFTISRGFWVLLSYYLGLLPFIPIPEKPFFCFriPNIINRT 
YFPFS CS C LNQLLAWS KWL IMRKSL I R KLEERGVQ WFWCLNE 
ESDFEAAFSVGATGVITDYPTALRHYLDNHGPAART5 


6003 
6004 


140 


4098 


GKLiRAFRGMRRLI C KR I CD YKS FDDEE S VDGNRPS S AAS AF KVP 
AP KTSGNPANSARKPGSAGG PKVGAGAS KEGGAGAVDEDDF I KA 
K-rOVPSIQIYSSRELEETLNKIRErLSDDKHDWDQRANALKKIR 

sllvagaaqydcffqhlrlldgalklsakdlrsqwreacitva 
klstvlgnkfdhgaeaivptlfnlvpnsakvmatsgcaairfii 

RHTH VPRLI PL ITSNCTS KSVP VRRRS FE FLDLLLQE WQTHSLE 

rhaavlvetikkgihdadaearvearktymglrnhfpgeaetly 

NSLEPSYQKSLQTYLKSSGSVASLPQSDRSSSSSQESLNRPFSS 
KWSTANPSTVAGRVSAGSSKASSLPGSLQRSRSDIDVNAAAGAK 
AHHAAGQS VR SGRIXSAGAliNAGS YASLEDTSD KLEX3TAS EDGRV 
RAKLSAPLAGMGNAKADSRGRSRTKMVSQSQPGSRSGSPGRVLT 
TTALS TVSS GVQR VL VNS AS AQKRS KI PRS QG CS REAS P SRLSV 
ARS S R X PRPS VSQG CS REASRE S S RDTS P VRS FQ PLAS RHHSRS 
TGALYAPEVYGASGPGYGISQSSRLSSSVSAMRVIiNTCSDVEEA 
VADALLLGD IRTKKKPARR RYES YGMHSDDDANSDASS ACS ERS 
YSSRNG5IPTYMRQT\EDV\AEVIiNRCASSNPfSERKEGLLGLaN 
LLKNQRTLSRVELKRLCEIFTRMFADPHGKRVFSMFLETLVDFI 
QVHKDDIiQDWLFVLLTQLLKP^ADLLGSVQAKVQKAIiDVTRES 
FPNDLQFNILMRFTVDQTQTPSLKVKVAILKYIETLAKQMDPGD 
FINS S ETRLAVS RVTTWTTE PKS S D VRKAAQS VL I S LFE LNT PE 

FTMI.LGALPKTFQDGATKLLHNHLRNTGNGTQSSMGSPLTRPTP 
RS PAN WS S PLTS PTNTS QNTLS PSAFD YDTENMNS ED I YSS LRG 
VTEAIQNFSFRSQEDMNEPLKRDSKKDDGDSMOGGPG\MSDPRA 
GGDATDSSQTAL\DNKASLLHSMPTHSSPRSRDYNPYNYSDSIS 
P FNKSALKE AMFDDDADQFPDDLSLDHSDLVAELL K E I iSNHNER 
VEERKIAIiYELMKIiTQEESFSVWDEHFKTILLIiLLETLGDKEPT 
IRAI^KVLREII^HQPARFKNYAELTVMKTLEAHKDPHKEVVR 
SAEEAASV\LATSI\SPEQCIKVLCPIIQTADYPINLAAIKMQT 
KVIERVSKETLNLLLPEIMPGLIQGYDNSESSVRKACVFCIiVAV 
HAVIGDELKPHLS QLTGS KMKLLNL Y I KRAQTGSGGADPTTDVS 
GQS 




140 


4096 



GKLRAFRGMRRL I CKR I CD Y KS FDD EES VDGNR P S S AAS A"FKVP 
APKTS GN P AJJSARKPGS AGGP KVGAGAS K EGGAGAVDE DDF I KA 
F TDVPS I Q I YS S R ELEETLNKI RE ILS DDKHDWDQRANALKK1 R 
SLLVAGAAQ YDCFFQHLRLLDGAIiKLS AXDLRSQWREACI TVA 
HLSTVLGNKFDHGAEAI VPTLFNLVPNSAKVMATSGCAAIRFI I 
RHTHVPRL I PLITSNCTSKS VP VRRRS FE FLDLLLQEWQTHS LE 
RHAAVLVETI KKG I H DADAEARVEAR KT YMGLRNH FPG EAETL Y 
NSLEPSYQKSLQTYLKSSGSVASLPQSDRSSSSSQESLNRPFSS 
KWSTANPS TVAGRVS AG SS KAS S LPGS LQRSRSDI DVMAAAGAK 
AHHAAGQSVRSGRLGAGATJWGSYASIjEDTSDKLDGTASEDGRV 
f^^^^rt^j^^mijJAiAKAlJ&KGRSRTKMVSQSQPGSRSGS PGRVLT 
TTALSTVS SGVQRVLVNSASAQKRSKIPRSQGCSREAS PSRIiSV 
ARSSRIPRPSVSQGCSREAS RES SRDTSPVRSFQPLAS RHHSRS 
rGALYAPEVYGASGPGYGISQSSRIiSSSVSAMRVLNTGSDVEEA 
VADALLLGDIRTKKKPARRRYESYGMHSDDDANSDASSACSERS 
if S SRNGS I PT YMRQT\EDV\AB VXjNRCASSNWS ERKBGLLGIiQN 
jLKNQRTLSRVELKRLCEIFTRMFADPHGKRVFSMFLETLVDFI 
3VHKDDU3DWLF\OiTQLLKKMGADLLGSVQAKVQKALDVTRES 
7 PNDLQFN I LMRFTVDQTQTP SLKVKVAI LKY r ETLA KQMDPGD 
'INSSETRLAVSRVITWTTEPKSSDVRKAAQSVLISLFEIjNTPE 
FTMLLGALP KTFQDGATKLLHNHLRWTGNGTQSSMGS PLTRPTP 
RSPANWSSPLTSPTNTSQNTLSPSAPDYDTENMNSEDIYSSLRQ 
VTE AIQNFS FRSQEDMNEPLKRDS KKDDGDSMCGGPG \MSDPRA 
GGDATDSSQTAL\DNKASLLHSMPTHSSPRSRDYNPYNYSDSIS 
PFNKSAIiKEAMPDDDADQFPDDLSLDHSDLVAEIiLKELSNHKER 
VEERKIALrELMKLTQEESFS VWDEHFKTI LLLLLETLGDKEPT 
IRALALKVLREILRHQPARFKNYAELTVMKT^EAHKDPHKEVVR 
SAEEAAS V\LATS I \SPEQCIKVLCPI IQTADYPINLAAIKMQT 
KVIERVSKETLNLLLPEIMPGLIQGYDNSESSVRKACVFCLVAV 
HAVIGDELKPHL.SQLTGSKMKLLNLYI KRAQTGSGGAD PTTDVS 
GQS 


6005


133


| 5 9SS 


RS SGRRQEQLGQ FPGRER KGMAS GLGS PS PCSAGSEEEDMDALL" H 
NNSLPPPHPENEEDPEEDLSETETPKLKKKKKPKKPRDPKIPKS 
KRQKKERMLLCRQLGDS SGEG PE F VEEE EE VALRS DS EGS DYT P 
GKKKKKKLGPKKEKKSKSKRKEEEEEDDDDDDDSKEPKSSAQLL 
EDWGMEDIDHVFSEEDYRTLTNYKAFSQFVRPLIAAKNPKIAVS 
KMMMVLGAKWR3 FS TNNP FKG S SGAS VAAAAAAAVAWES M VTA 
TE VAP P P P P VE VPI R KAKTKEGKGPNARR KPKGS PRVPDAKKPK 
PKKVAPLKIKLGGFGSKRKRSSSEDDDLDVESDFDDASINSYSV 
SDGSTSRSSRSRKKLRTTKKKKKGEEEVTAVDGYETDHQDYCEV 
CQQGGEI ILCDTCPRAYHMVCLDPDMEKAPEGKWSCPHCEKEGI 
QWEAKEDNS EGEE IL B E VGGDLEEEDDHHME FCR VCKDGGELLC 
CDTCPSSYHIHCLNPPLPEIPKGEWLCPRCTCPAIiKGKVQKILI 
WKWGQPPSPTPVPRPPDADPNTPSPKPIiEGRPERQFFVKWQGMS 

Y WHCS W VS E LQLE LHC \ QVM FRNYQRKNDMD EP PSGDFGGDEEK 
S\RKRKNKDPKFAEMBERFYRYGIKPEW\MMIHRIIiNHSVDKKG 
HVHYLI KWRDIjP YDQAS WESEDVEI QDYDLFKQS YWNHRELMRG 
EfiGRPGKKLKKVKLRKI»ERPPETPTVDPTVKYERQPEYLDATGG 
TLHPYQMEGLNWLRFSWAQGTDTIIiADEMGLGKTVQTAVFIjYSIi 

Y KEGHS KG P FL VSAPLS T I IN\WEREFEMWAPDMYV\VTYVGDK 
DSRAI I RENEFS \ FEDNAI RGGKKASRMKKEASVKFHVLLTS YE 
LITIDMAILGSIDWACLIVDEAHRLKNNQSKFFRVLNGYSIiQHK 
LLLTGTPLQNNLEELFKLLNFLTPER FHNLEGFX.ee FAD I AKED 
Q I KKLHDMLG \ PHMLRR L KAD V FKNM PS KTEL I V\ RVELS P M \ Q 
KKYYK\YILHSKFIjKALN\ARGGGNQVSLLNVVMDLKKCCNHPY 
LFPVAAMEAPKMPNGMYDGSALIRASGKIiLLLQKMLKNIiKEGGH 
RVLIFSQMTKMLDIiLEDFLEHEGYKYERIDGGITGNMRQEAIDR 
FNAPGAQQFCFLIiSTRAGGLGINLATADTVI I YDSDWNPHNDIQ 
AFSRAHR IGQNKKVMI YRFVTRASVEERITQVAKKKMMLTHLW 
RPGLGSKTGSMSKQELDDILKFGTEELFKDEATDGGGDNKEGED 

ssvihyddkaierlldrnqdetedtelqgmnbylssfkvaqyw 
reebmgeeeevere 1 1 kqees vdpdywekllrhhyeqqqbdlar 
nlgkgkrirkqvnyndgsqedrdwqddqsdnqsdysvaseegde 
dfderseaprrpsrkglrndkdkplppllarvggnievlgfnar 
qrkaflnaimrygmppqdafttqwlvrdlrgksekefkayvslf 
mrhlcepgadgaetfadgvpreglsrqhvltrigvmslxrkkvq 
efehvngrwsmpelaeveenkkmsqpgspspktptpstpgdtqp 
ntpapvppaedgikieenslkeeesiegekevkstapetaiect 
qapapasedekvweppegeekvekaevkerteepmetepkgkg 

AADVEKVEEKSAIDIjTPIWEDKEEKKEEEEKKEVMLQNGETPK 

di^ekqkknikqrfmfniadggftblhslwqneeraatvtkkt 

VPTWUD DUl^VWY T Ti r~ t TtTTTmrn i~ir t_t-_ h-_ i_ ^ M . _ 

xtui WHRKHDYWLLAGI inhgyarwqd iqndpryailmepfkgem 
nrgnfleiknkfiarrfklleqalvieeqlrraaylnmsedpsh 
psmalntrfaeveciaeshqhlskesmagnkpanavlhkvlkql 
eellsdmkadvtrlpatiari ppvavrlqmsernilsrlanrap 
bptpqqvaqqq 


6006


1 


965


ONDFLRNT\mRHEPPVrAEPIR]XAENEDWVVDKPSSlPVHPC 
3RFRHNTVIFILGKEHQLKELHPLHRLDRLTSGVI,MFAKTAAVS 
3R IHEQ VRDRQLEKE Y VCR VEGEFPTE EVTCKEP 1 1» WS YKVG V 
^VDPRGKPCETVFQRLSYKGQSSWRCRPLTGRTHQIRVHLQF 
.GHPILNDPIYNSVAWGPSRGRGGYIPKTNEEGLRDLVAEHQAK 
QSLDVU5LCEGDI^PGI,TDSTAPSSELGKDDLEELAAAA\QKMe'" 
B VAEAAPQ ELDTI ALAS EKA VETDVMNQ \ RQT \ TLCR V PAGATG 
SLAPRPCDVPTCPTL 




6007
6008


3 


2351


HELGQVEYVFTDKTGTiTENEMQFRECS INGM KYQE I NGRLVP E

GPTPDSSEGNLSYLSSLSHLNNLSHLTTSSSFRTSPENETELIK 

EHDLPFKAVSLCHTVQINNVQTDCTGDGPWQSIJLAPSQLEYYAS 

SPDEKALVEAAARIGIVFIGNSEETMEVKTLGKLERYKLLHILE 

FDSDRRRMSVIVQAPSGEKLLFAKGAESSILPKCIGGEIEKTRI 

HVDEFALKGLRTLCIAYRKFTSKEYEEIDKRIFEARTALQQR\E 

EKIiAAVFQFI EKDLI LLGATAVEDRLQDKVRETI EALRMAG I KV 

w v u i icu kji t x a v b v S JUS CGHFHRTMN I LE L I NQ KSDSE CAEQLR 

QLARRITEDHVIQHGLWDGTSLSLALREHEKLFMEVCRNCSAV 

LCCRMAPLQKAKVIRIjIKISPEKPITLAVGDGANDVSMIQEAHV 

G I G I MG KEGRQAARNSDYAI AR FKFLS KLLF VHGHFY Y I R I ATL 

VQYFFYKNVCFITPQFLYQFYCLFSQQTLYDSVYLTLY\NICFT 

SLP I LI YSLLEQHVDPHVXiQNKPTLYRDIS KNRLLS I KTFL YWT 

ILG FS HAFI FFFG S YLL I G KDTS LLGNGQMFGNWTFGTL VFT VM 

VIT\n*VX^LECTIFWTWlI^VTWGSIIFYFVFSLFYGGILWPF 

LGSQNMYFVF IQLLSSGSAWFAIILMWTCLFLDI IKKVFDRHL 

HPTSTEKAQLTETNAGIKCLDSMCCFPEGEAACASVGRMLERVI 

GRCSPTHI SRS WSASDPFYTNDRS ILTLSTMDSSTC 






4554 


1089 


A3VRRAGARRGPGRALPAGATAVP PPSARRRRRCPAPEHAG PAR 
ASRPSQETMFQLPVNNLGSLRKARKTVKKILSDIGLEYCKEHIE 
DFKQFE PNDF YLKNTTWEDVGLWDPS LTKNQD YRTKPFCCSACP 
FSSKFFS A YKSH FRNVHS E D FENR I LLNCP YCTFNAD KKTLETH 
IKIFHAPNASAPSSSLSTFKDKNKNDGLKPKQADSVEQAVYYCK 
KCTYRDPL YE I VRKHI YREHFQHVAAP Y I AKAGE KS LiNGAVP LG 
SNAREESSIHCKRCLFMPKSYEALVQHVIEDHERiaYQVTAMIG 
HTNVWPRSKPLMLIAPKPQDKKSMGLPPRIGSLASGNV\RSLP 
SQQMVNRLS I PKPNLNSTGVNMMSS VHLQQNN YG VKS VGQGYS V 
GQSMRLGLGGNAPVS I PQQSQS VKQLLPSGNGRS YGLGSEQRSQ 
APARYSLQSANASSLSSGQLKSPSLSQSQASRVLGQSSSKPAAA 
ATGPPPGNTSSTQKWKICTICNELFPENVYSVHFEKEHKAEKVP 
AVANY I MK I HNFTS KCL YCNR YLPTDTL LNHML I HG LS CP YCRS 
TFNDVEKMAAHMRMVHIDEEMGPKTDSTLSFDLTLQQGSHTNIH 
LLVTTYNLRDAPAESVAYHAQNNPPVPPKPQPKVQEKADIPVKS 
SPQAAVPYKXDVGKTLCPLCFS I LKGP ISDALAHHLRERHQVI Q 
x. v nrvciviu4 1 x j. n^UaV x I £>N MTAS TIT LHLiVH CRG VG KTQN 
GQDKTNAPSRLNQS PSLAPVKRTYEQME FPLLKKRKLDDDSDSP 
S FFEEKP EEP WLALDPKGH \ E DDS YEARKS FLTKYFT \ KQP YP 
TRREIEKLAASLWV\WK\SDIASHFSNKRKKCVRDCEKYKPGVL 
LGFNXKELNKVKHEMDFDAEGLFENHDEKDSRVNAS KTADKKLN 
LGKEDDSSSDSFENLEE2SNESGSPFDPVFEVEPKISNDNPEEH 
VLKVIPEDASESEEKLDQKEDGSKYETIHLTEEPTKLMHNASDS 
EVDQDDWEWKDGASPSESGPGSQQVSDFEDNTCEMKPGTWSDE 
S S QS EDAR S S KPAAKKKATMQGDREOLK WKNSS YGKVEG F WS KD 
QSQWXNAS ENDERLSNPQI EWQNSTIDS EDGEQFDNMTDGVAE P 
MHGS LAGVKL S SQQA 


6009 


4272 


1534 


CHGLQHLTPFREr J NLSLQG*EPH*AA*QAVRSEEKSIC*GSPSC 
HLVLGVLVPVARQSSHSAGPAQSAFR*TGTGSGTPKAAEQSGYW 
EAYTLGHQHWNMFPIQRPPLVMKGRRIMCGKCEKG*VSDSVTGG 
RAVAG EQ AS QRRT VFTAGGGECLGAKS VRAS VFTGNQ PG VMG LL 
NGKRGGCFESGYLFGFIVIGKIQSLEAKVPLPVNGQTGERASPG 
NCRIHIVDAVC*SEHH*DHFLAAAFLENSTIIS*VAPGSWQDHA 
VLQKEVQASVRCRGFESVDTAPAGFWAHSPPGLQGEPTTTSVSL 
FVLAPQDGEG V PFVEGQL VTVLG LWP QS I RHTFVHHTQLFLHP 
I * KLGALDVAF LHLLTLVCS S FNVAYG * GKNGGTT LHQLFAE VN 
^VTRGSAVQRRPS IT1SS IHVDTKIQQELHDVMVAGADGWQWG 
3PFWGLAGIFHLIDDPLHQIELSFQRRV*EQCQGVKPDSQPVP 
^PLRVGLLQVGPLVRGGGRRVAGRGKRCWRDLLFPWRWGLSHRT 
IDLLRGGDRGHWVI VLCRLGSLVGGLGTDBLLWFGGR » L, 1 1 IG 
I * * RGRLSGE WGCGLGRGELFQVS IGIG VS IVHIGQGDHEVLGG 
AGL VERGALHATGQGVEALVQQLLDVGPAGALGLCDGAALFQG P 
GRVGQLPAEGLQVCITLVAQWRMHDGRELGGAEWPWQALHGAAI 
CG VGGAI LLKALSQ Y FL KGG * RL WCARGQ * P VKKRQRRWRG * TR 
R * NGLTI H CFN * L I * GAVCCRLVI LR WCGLLBVHG VYGT * I HCL 
GS FPGRLWP+ PFI SQERPNGHCQWE FRLAVPSWKCR WSRWRVRG 
TWR YGN PLLNLL * GAWLGG AACGGQQGG PLS TWQACTGPGQAAF 
LP P FQGACRPRTQRCRTWVCPI AWRQLLAYTRD 


6010 


1 


3533 


I MP CGS S RLLRGCWTHPN E P VS DLS YFDC IES VMENS KVLGESM
AG I SQNAKTGDLPAFGECVGI AS KALCGLTEAAAQAAYLVG I FD 
PNSQAGHQGLVDP IQFARANQAI QMACQNLVDPGSSPSQVLSAA 
TIVAKHTSAtiCNACRIASSKTANPVAKRHFVQSAKEVANSTANL 
VKT I KALDGD FS EDNRNKCRI ATAPL I EAVENLTAFASNPEFVS 
I PAQI S S EGS QAQE P I LVSAKPMLE S S S YLI RTARS LA INP KDP 
PTWS VLAGHS H ? VS DS I KS L I TS I RDKAPGQRE CD YS IDG INRC 
IRDIEQASLAAVSQSIiATRDDISVEALQEQLTSWQEIGHLIDP 
I ATAARGEAAQLGHKGTQLAS YFEPL I LAAVG VAS KILDHQQQM 
TVLD QTKTLAE SALQML Y AAKEGGGNP KAQHTHDAI TE AAQLMK 
EAVDD IMVTI.NEAASEVGLVGGMVDAI AEAMS KLDEGTPPE PKG 
TFVDYQTTVVKYSKAIAVTAQEMMTKSVTNPBELGGIASQMTSD 
YGHLAFQGQMAAATAEPEE I G FQ I RTR VQDLGHGC I ELVQKAG\ 
ALQVCPTDS YTKRE LI ECARAVTEKVSL VLS ALQAGNKGTQACI 
TAATAVSGI I ADLDTTI MFATAGTLNAENSETFADHREN I LKTA 
KALVEDTKLLVSGAAST PDKLAQAAQSSAATI TQLAEWKLGAA 
oiAjoL/x/rDiyv vuiM/iJ, Ai^VAj\AbaDljXSATKGAASKPVDDPSM 
YQLKGAAPCVMVTNVTSLLKT VKA VEDEATRGTRALEAT I EC I KQ 
ELTVFQS KDVPEKTSSPEES IRMTKGI TMATAKAVAAGNS CRQE 
DVIATANLSRKAVSDMLTACKQASFHPDVSDEVRTRALRFGTEC 
TIXTYLDLLEHVLVI^QKPTPELKQQLAAFSKRVAGAVTELIQAA 
EAMKGTEWVDPBDPTVIAETELLGAAASIEAAAKKLEQLKPRAK 
PKQADETLDFEEQILEAAKSIAAATSALVKSASAAQRELVAQGK 
VGS I PANAADDGQWS QGLI SAARMVAAATSSLCEAANASVQGHA 
SEEKLISSAKQVAASTAQLLVACKVKADQDSEAMRRLQAAGNAV 
KRASDNLVRAAQKAAFGKADDDD VWKTKFVGG I AQI IAAQEEM 
LKKERELEEARKKLAQIRQQQYKFLPTELREDEG 


6011 


446 


1835 


LLQPAMRKSPGLSDCLWAWI LLLSTLTRB cvrriDCTrinuT trrwii — 

TVFTRI LDRLLDGYDNRIjRPGLGERVTEVKTDI FVTSFGPVSDH 

DMEYTI DVFFRQSWKDERLKFKGPMTVLRLNNLMAS KIWTPDTF 

FHNGKKS VAHNMTMPNKLLR I T3DGTLLYTMRLT VR \AE CPMAF 

GRDFPM\D\AHACPLKFGSYAYTRAEWYEWTREPARSVWAED 

GSRLNQYDLLGQTVDSGIVQSSTGEYVVMTTHFHLKRKIGYFVI 

QTYLPC IMT VI LSQVS FWLNRES VPARTVFG VTTVLTMTTLS I S 

ARNSLPKVAYATAMDWFIAVCYAFVFSALIEFATVNYFTKRGYA 

WIX5KSVVPEKPKKVKDPLIKKWNTYAPTATSYTPNLARGDPGLA 

TIAJCSATIEPKEVKPETKPPEPKKTFNSVSKIDRLSRIAFPLLF 

GIFNLVYWATYLNREPQLKAPTPHQ 


6012 


351 


5013



PAELFQSFAIWHKELYDWRLGPWNQCQPVISKSLEKPLECIKGE 
EGIQVREIACIQKDKDIPAEDIICEYFEPKPLLEQACLIPCQQD 
CIVSEFSAWSECSKTCGSGLQHRTRHWAPPQFGGSGCPNLTEF 
QVCQSSPCEAEELRYSLHVGPWSTCSMPHSRQVRQARRRGKNKE 
REKDRSKGVKDPEARELIKKKRNRNRQNRQENKYWDIQIGYQTR 
EVMCINKTGKAADLSFCQQEKLPMTFQSCVITKECQVSEWSEWS 
PCSKTCHDMVSPAGTRVRTRTIRQFPIGSEKECPEFEEKEPCLS 
QGDG WP CAT YGWRTTE WTECRVD PLLS QQD KRRGNQTALCGGG 
I QTREVYCVQANENLLS QLSTHKNKEAS KPMDLKLCTGP I POTT 
QLCHIPCPTECEVS PWS AWGPCT YENCNDQQGKKGFKLRKRR IT 
NE PTGGSGVTGNCPHLL2AI PCEEPAC YDWKAVRLGDCE PDNGK 
BCGPGTQVQE WCINSDGEEVDRQLCRDAIFP I PVACDAPCPKD 
CVLS TWS TWS S CSHTCSGKTTEG KQI RARS I LAYAGEEGG I RCP 
^3SALQEVRS CNEHPCTVYHWQTGPWGQCIEDTSVSSFNTrTTW 
^GEASCSVGMQTRKVI CVRVNVGQVGPKKCPES LRPETVRPCLL 
PCKKDCIVTPYSDWTSCPS\SCKEGDSSIRKQSRHRVIIQLPAN

GGRDCTDPLYEEKACEAPQACQSYRW\KTHKW\HRCQ\LVP\WS 

VQQDS P\GAQEGCG PGRQARAI TCRKQDGGQAGI HECLQ YAG P V 

PALTQACQI PCQDDCQLTSWS KFSSCNGDCGAVRTRKRTLVGKS 

KKKEKCKNSHLYPLIETQYCPCDKYNAQPVGNWSDCILPEGKVE 

VLLGMKVQGDIKECGQGYRYQAMACYDQNGRLVETSRCNSHGYI 

EEACIIPCPSDCKLSEWSNWSRCSKSCGSGVKVRSKWLREKPYN 

GGR P C PKLDHVNQAQ VYE WPCHSDCNQ YLWVTEPWS I CKVTF V 

NMRENCGEGVQTRKVRCMQNTADGPSEHVEDYLCDPBEMPLGSR 

VCKLPCPEDCVISEWGPWTQCVLPCNQSSFRQRSADPIRQPADE 

GRS CPNAVEKEPCNLNKNC YHYD YNVTD W S TCQIjSEKAVCGNG I 

KTRM LDC VRS DGKS VDL K YCEALG LEKNWQMNTS CMVECP VNCQ 

LSDWSPWSECSQTCGLTGKMIRRRTVTQPFQGDGRPCPSLMDQS 

KPC PVKPCYRWQYGQWS PCQVQEAQCGEGTRTRNISCWSDGSA 

DDFSKWDEEFCADIELI IDGNKNMVLBESCSQPCPGDCYLKDW 

SSWSLCQLTCVNGEDLGFGGIQVRSRPVIIQEIjENQHLCPEQML 

ETKSCYDGQCYEYKWMASAWKGSSRTVWCQRSDGINVTGGCLVM 

SQPDADRSCNPPCSQPHSYCSBTKTCHCEEGYTEVMSSNSTLEQ 

CTLIPVVVLPTMEDKRGDVKTSRAVHPTQPSSNPAGRGRTWFLQ 

PFGPDGRLKTWVYGVAAGAFVLLIFIVSMIYLACKKPKKPQRRQ 

NNRL KP LTLAYD GDADM 




6013 


1161 


710 


GAFI AG VP VQ P VL I R YPN S LDTTS WAWRG PG VLKVLWLTASQP C 
S I VDVE FLP VYHPSP E ESRDPTL YANNVQR VMAQALG 1 PATECE 
F VGS L PVI WGRLKVALE PQL/ WGTGKSASEGWAVRKLCGRWGR 
ARPESNDQPGRVCQAATAli 




6014 


2857 


613 


EAVAGGME KS RMNLPKGPDTLCFDKDEFMKEDFD VDHF VS DCRK 
RVQLEELRDDLELYYKLLKTAMVELINKDYADF\VNLSTNLVSM 
DKALNQLSVPLGQLREEVLSIiRSSVSEGIRAVDERMSKQEDIRK 
KKMCVLRLIQVIRSVEKIEKILNSQSSKETSALEASSPLLTGQI 
LERI ATEFNQLQFHACQS K\GMPLLDKVRPR I AG ITAMLQQSDE 
GL LLEG LQTS DVD 1 1 RHCLRTYAT I DKTRDABAL VGQVTjVKP Y I 
DE V 1 1 EQ F VESH PNGLQ VM YNKIiLE FVPHHCRLLRETVTGGA I SS 
EKGNTVPGYDFIjVNSVWPO I VOGI*E EKLPSLFMPGNPnAFHRirv' 

tismdfvrrlerqcgsqasvkrlrahpayhsfnkkwnlpvyfqi 

RB'RE I AGSLEAAIiTDVIjEDAPAES P YCLIiASHRTWSSLRRCWSD 

emflpllvhrlwrlhsgrfwarysvfv\n\elslrpisnespke 

IKKPLVTGSKEPS ITQGNTEDQGSGPSETKPWS ISRTQZjVYW 
ADLDKLQEQLPELLEI I KPKLEMIGFKNFSS I SAALEDSQS s fs 

acvpslsski iqdi*sdscfgflksalevprlyrrtnkevpttas 

S YVDS ALKPLFQLQSGHKDKLKQAI I QQWIiEGTliS ESTHKYYET 
VSDVLNSVKKMEESLKRIiKQARKTTPANPVGPSGGMSDDDKIRIi 
QLALDVE YLGEQ rQKLGLQASDIKS FSALAELVAAAKDQATAEQ 
P 


6015
6016


13 
13 


2237 


aegcaerrgtepwelsmswesgagpglgsqgmdlvwsawygkc^ 
vkgkgsiiplsahgiwawlsraewdqvrvylfcddhklqryaln 
ritwjrsrsgnelplavastadlircklldvtgglgtdelirlly 
gmalvrfvnliserktkfakvplkclaqevnipdwivdlrhelt 
hkkmphindcrrgcyfvldwiiqktywcrqlefslretweleefr 
egieeedqeedknivvdditeqkpbpqddgkstesdvkadgdsk 
gsee vds hck kaiis h kel yerarell vs yeeeqftvle k fr ylp 
kai kawnnps prvec viiaelkgvtcenreavldaflddgflvpr 
feqlaalqieyeenvdlndvlvpkpfsqfwqpllrglhsqnftq 
allermlselpalgisglrptyiiirwtvelivantktgrnarrf 
sagq wearrgwrlfncsas ldwprmvesclgs pc was pqllr 1 1 
f\kamgqglqde\eqekllri cs i ytqsgens lvqegseas p ig 
kspytldslywsvkpasssfgseakaqqqeeqgsvndvkeeeke 
ekevl pdq veeee enddqeeee ededdeddeeedrme vgp fs tg 
qes ptaenarllaqxrgalqgsawq vss edvrwdtfp\lgrm p r 
srprtpaelmlenydthvifwtkpvlXeqrlepstckXtdtlgl 

\SCGVGS\GNCSNSSSSNFRGAFLLEARGSLH\GL\KTGLQLF 
203


3469


6018


6019


13


2510


1066


VKGKGSLPIiSAHGIWAWLSRAEWDQVTVYLPCDDHKLQRYALN 
RITVWRSRSGNELPLAVASTADLIRCKLLDVTGGLGTDELRLLY 
GMALVRPVNLISERKTKPAKVPLKCLAQEVNIPDWIVDLRHELT 
HKKMPKINDCRRGCYPVLDMIiQKTyWCRQLENSLRETWEIiEEPR 
EG I EEEDQEEDKN IWDDI TEQKPEPQDDGKSTESDVKADGDS K 
GSEE VDSHCKKALS HKE LYERARELLVS YEEEQFTVL EKFR YL P 
KAI KAKNNPS PRVECVtAELKGVTCENREAVLDAFLDDGFLVPT 
FEQLAALQI E YEENVDLNDVL WPKPFSQFWQPLLRGLHSQHFTQ 
ALLERMLS EL P ALG I S G I RPT Y I LRWTVE L t VANTKTGRNARRF 
SAGQWEARRGWRLFNCSASLDVJPRMVBSCLGSPCWASPQLliRII 
F \ KAMGQGLQDE \ BQEKLLRICSI YTQSGENS LVQEGS EAS PIG 
KSPYTLDSLYWSVKPASSSFGSEAKAQQQEEQGSVNDVKEEEKE 
E KE VX»PDQ VE EEEBNDDQ EE BEE DEDDE DDEK EDR MEVGP FSTG 
QESPTAENARLLAQKRGALQGSAWQVSSEDVRWDTFP\LGRMPR 
SRPRTPAELMLENYDTHVIFWTKPVL\EQRLEPSTCK\TDTLGL 
\SCGVGS\GNCSNSSSSNFRGAFLLEARGSLH\GL\KTGL QLF 
HHQE IEQNSAMAPRKRGGRGXSF I FCCFRNNDHPEITYRLRNDS 
NFALQ1WB PALPMP PVEEIjDVMFSELVDELDLTDKHREAMFALP 
AEKKWQIYCSKKKDQEENKGATSWPEFYIDQLKSMAARKSLLAL 
E KE EEEERS KT I ESIjKTALRTKPMRFVTR F I DLDGLS C I LNFLK 
TMDYETSESRIHTSLIGCIKALMNNSQGRAHVLAHSBSINVIAQ 
SLSTENIKTKVAVLEILGAVCLVPGGHKKVLQAMLHYQKYASER 
TR FQTLINDLDXSTGRYRDE VS LKTAIMS FI NAVLSQGAGVESL 
D FRLHLR YE \ FLMLG I H P VMDKLRKHENSTLDRHLDFFEMLRNE 
DELE FAKRFE LVHIDTKSATQMFELTRKRIiTHSEAYPHFMS ILH 
H CLQMP YKR S GNTVQ Y WIiLLDR 1 1 QQIVIQHDKGQDPDS TPLEN 
FNIKNWRMLVNENEVKQMKEQABKMRKEHJNELQQKLEKKEREC 
DAKTQE KEEMMQTLNKMKE KLEKBTTEHKQ VKQQVAELTAQLHE 
LSRRAVCASIPGGPSPGAPGGPFPSSVPGSLLPPPPPPPLPGGM 
LPPPPPPLPPGGPPPPPGPPPLGAIMPPPGAPMGLALKKKSIPQ 
PTNALKSFNWSKLPENKXEGTVWTEIDDTKVFKILDLEDIiERTF 
SAYQRQQDFFVNSNSKQKEADAIDDTLSSKLKVKELSVIDGRRA 
QNCN I LLS RLKLSNDE I KRAI LTMDE QEDLPKDMliEQLLKF VPE 
KSDIDLLEEHKHELDRMAKADRFLFEMSRINHYQQRDQSLYFKK 
KFAEEVAEVKP KVEATRSGSEEVFR5 GALKQLLEVVLAFGNYMN 
KGQRGNA YGFK I S S LNK IADTKS S I D KNI TLLHYL I TI VEN K YP 
SVLNLNEELRD I PQAAKVNMTELDKE I STLRSGLKAVETELE YQ 
KSQP PQPGDKF VS WSQ PI T VAS FS FSDVEDLLAEAKDLFTXAV 
KHFGEEAGKIQPDEFFG IFDQFLQAVSEAKQENENMRKKKEEEE 
RRARMEAQLKEQRERERKMRKAKENSEESGEFDDLVSALRSGEV 
FDKDLSKLKRNRKRITNQMTDSSRERPITKL NF 

I TISQSGGIRRRREAVWFEWNMDFSRLHMYSPPQCVPENTGYTY 
ALSSSYSSDALDFETEHKLDPVFDSPRMSRRSLRLATTACTLGD 
GEAVGADSGTSSAVSLKNRAARTTKQRRSTNKSAFSIiVHVSRQV 
TSSGVS YGGTVSLQDAVTRRPPVLDES WI REQTTVDHFWGLDDD 
. GDLKGGNKAAIQGNGDVGAGAATGHNGFPCSNCNMLS2RKDVLT 
AHPAAPGPVSRVYSRDRNQKCDDCKGKRHLDAHPGRAGTLWHIW 
, ACAG Y FLLQ ILRR IGAVGOAVS RTAWS ALWLAWAPG KAASG VF 
I WWLGIGWYQFVTLISWLNVFLLTRCLRNICKFLVLLIPLFLLLG 
LSLRGQGVNFFSFLPVLNWASMHRTQRVDDPQDVFKPTTSRLKQ 
PLQGDSEAFPWHWMSGVEQQVASLSGaCHHHGENLRELTTLLQK 
LQARVDQMEGGAAGPSASVRDAVGQPPRETDFMAPHQEHEVRMS 
I HLEDILGKLREKSEAIQKELEQTKQKTISAVGEQLLPTVEHLQL 
. ELDQL KS ELS S WRHVKTGCETVDAVQERVD VQVREMVKLL FS ED 
QQGGSLEQLLQRFSSQFVSKGDLQTMLRDLQLQILRNVTHHVSV 
TKQIiPTS EAWSAVSE AG ASG I TEAQARAI VNS AJbKL YSQDKTG 
M VDFALES GGG S ILS TRCS ETYETKTALMS LFGI PLW YFS QS PR 
WIQPDI YPGNCWAFKGSQGYLWRLSMMIHPAAFTLEHI PKTL 
! S PTGN I S SAP KDFA V YGI.ENE YQK EGQLLGQ FT YDQDGES LQMF 
QALKRPDDTAFQI VELR I FSNWGHPE YT CLYRFR VHOE P VK 

TPMnPPPPDnDDPgf»nr>Vr»ttT >«r.T»rrt» ■» : 



TPNDREPPPuKPPSSRRASHIAQEITS AASLGPQTQILGSLTTA 


6020 






PVTTSAIRSMPGISSQILTNAQGQVIGTLPWVVNSASVAAPAPA 
QS LQVQAVTPQLLLNAQGQVI ATLAS S PLP PP VAVRK\ PS TPES 
LLKSEVQPIKPTPTVPQPAWIASPAPAAKPSASAPIPITCSBT 
PTVSQLVSKPHTPSLDEDGINLEEIRBFAKNFKIRRLSLGLTQT 
QVGQALTATEGPAYSQSAICRFEKIiDITPKSAQKLKPVLEKWLN 
EAELRNQEGQQNI*MEFVGGEPSKKRKRRTSFTPQAIEALNAYFE 

KNPLPTGQBITEIAKELMYDREWRVWFCNRRQTLKNTSKLNVP 
QIP


6021


4953 


549 


EAIQFEVSiGNYGNKFDlTCKPLASTTQYSRAVFDGNYYYYLPW 
AHTKPWTLTSYWEDISHRLDAVNTLLAMAERLrQTNIEALKSGI 
QG K I PAN QLAEL WLKL I DEVI EDTR YT L P IiTEG KANVT VLDTQ I 
R KLRSR S L S Q I HEAAVRMRS E ATD VKSTLAE I E DWLDKLMQLTE 
EP QNSKPD III WM I RGE KRLAYAR I PAHQ VL YS TSGENASG KY C 
GKTQTI FLKY PQE KNNGPKVPVELRVN I WLGLSAVE KKFNS FAE 
GTFTVFAEMYENQALMFGKWGTSGLVGRHKFSDVTGKIKLKREF 
FLP ? KG WE WEGEW I VDPERSLLTEADAGHTE FTDEVYQNESR YP 
GGDWKPAEDTYTDANGDKAASPSELTCPPGWEWEDDAWSYDINR 
AVDEKGWEYGITIPPDHKPKSWVAAEKMYHTHRRRRLVRKRKKD 
LTQTASSTAGAMEELQDQEGWEYASLIGWKFHWKQRSSDTFRRR 
RWRRKMAPSETHGAAAI FKLEGALGADTTEDGDEKSLEKQKHSA 
TTVFGANTPIVSCKFDRDYIYHLRCYVYQARNLIALDKDSFSDP 
YAH I C FLHRS KTTE I I HSTLNPTWDQTI I FDEVEIYGEPQTVLQ 
NPPKVIMELFDNDQVGKDEFLGRSIFSPWKLN9EMDITPKLLW 
HP VMNGDKACGDVLVTAEL I LRG KDGSNLP I LP PQRAPNL YMVP 
QG I R PWQLTAI EI LAWGLRNMKNFQMAS I TS PS L WE CGGERV 
ESWIKNLKKTPNFPSSVLFMKVFLPKEELYMPPLVIKV^DHRQ 
FGRKPWGQCTIERLDRFRCDPYAGKEDIVPQLKASLLSAPPCR 
BIVIEMEDTKPIiLASKCLSSMSTALSKMASPATVHDTEKEEEIV 
DWWSKFYASSGEHSKCGQYIQKGYSKLKIYNCELENVAEFEGLT 
DFSDTFKLYRGKSDENEDPSWGEFKGSFRIYPLPDDPSVPAPP 
RQFREIj PDS VPQECTVR I Y I VRGLELQ PQDNNGLCDPY I K I TLG 
KKVI E \ DRDHY I PNTLNP VFGRMYELS CYLPQEKDLKISVYD YD 
TFTRDEECVGBTIIDLENPF\LSRFG\SHCG\ IPEEYCVSGVNTW 
RDSIiR\PTQ\LLQNVARFKGFPQPILSEDGSRIRYGGRDYSLDE 
FEANKHiHOHLGAPEERLALHILRTQGIiVPEHVETRTLHSTFQP 
NI S \ R Y YLR V 1 INNTKD V I LDEKS ITGEEMSDI YVKGWI PGNEE 
NKQKTDVHYRS LDGEGNFNWRFVFP FDYLPAEQLC I VAXKEH FW 
SIDQTEFRIPPR\LIIQIW\DNDKFS\LDDYLGFPRTLTCRHTI 
HFLQKS PGGNC / RGLDM I PDLKAMNPLKAKTAS LFEQ KSMKGWW 
PCYAEKDGAR VMAG KVEMTLE I LNEKEADERPAG KGR DEPNMNP 
KLDLPNRPETSFLWFTNPCKTMKFIVWRRFKWVI IGLLFLLILL 
LFVAVLLYSLPNYLSMKIVKPNV 




4953 


549


EAIQFEVS IGNYGNKFDTTCJCPLASTTQ YSRAV b'DGti ¥ YYYLPW 
AOTKPVVTLTSYWEDISHRLDAVNTLLAMAERLQTNIEALKSG I 
QGKIPANQLAELWLKLIDEVIEDTRYTLPLTEGKAWVTVLDTQI 
RKLRS RS LSQIHEAAVRMRSE ATD VKSTLAE I EDWLDKLMQLTE 
EPQNSMPDI 1 IWMIRGEKRLAYARIPAHQVLYSTSGENASGKYC 
GKTQTI FLKYPQE KNNGPKVPVE LRVN I WLGLSAVE KKFNS FAE 
3TFTVFAEMYEN QALMFGKWGTSGLVGRHKFS D VTG KI KLKRE F 
FLPPKGWEWEGEWIVDPERSLLTEADAGHTEPTDEVYQNESRYP 

3GDWKPAEDTYTDANGDKAASPSELTCPPGWEWEDHAWSYDINR 
WDEKGWEYGITIPPDHKPKSWVAAEKMYWTWPooot vn ynwim 

CTQTASSTAGAMEELQDQEGWEYASLIGWKFHWKQRSSDTFRRR 
*WRRKMAPSETHGAAAI FKLEGALGADTTEDGDEKSLEKQKHSA 
rT VFGANTP I VS CNFDRD YI yhlrcyvyqarnl laldkds fsdp 
f AHI CFLHRS KTTE I IHSTLNPTWDQTI I FDEVEIYGEPQTVLQ 
JPPKVIMELFDNDQVGKDEFLGRSIFSPWKLNSEMDITPKLLW 
rPVMNGDKACGDVLVTAELILRGKDGSNLPILPPQRAPNLYMVP 
}G IR PWQLTA IE I LAWGLRNMKNFQMAS ITS PSLWECGGERV 
SWIKNLKKTPNFPSSVLFMKVFLPKEELYMPPLVIKVIDHRQ 
X3RKP WGQ CT IERLDR FRCDP YAG KED I VPQLKAS LLSAPPCR 
SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








DIVIEMEDTKPLLASKCbSSMSTALSKMASPATVHLTEKEEEIV 
DWWSKFYASSGEHEKCGQYIQKGYSKLKIYWCELENVAEJEGLT 
DFSDTFKLYRGKSDENEDPSWGEFKGSFRIYPLPDDPSVPAPP 
RQFRELPDSVPQECTVRIYIVRGLELQPQDNNGLCDPYIKrTLG 
KKV IE\ DRDHYI PNTLNP VFGRM YELS CYLPOP Km ,tf T ^/vnvn 

TFTRDEKVGETIIDLENPF\LSRFG\SHCG\IPEEYCVSGVNTW 
R DSLR \ PTQ \ LLQNVAR F KGFPQP I LS EDGSR I R YGGRD YSIiDE 

FEANKILHOHLGAPEERLALHILRTQGLVPEHVETRTLHSTFQP 
NIS\RYYLRVIIWKTKDVILDEKSITGEEMSDIYVKGWIPGNEE 
NKQKTDVHYRSLDGEGNFNWRFVFPFDYLPAEQLCIVAKKEHFW 
SIDQTEFRIPPR\LIIQIW\DNDKFS\UDDYLGFPRTLTCRHTI 
HFLQ KSPGGNC / RG LD M I P DL KAMN PLKAKTAS L FEQKS M KGW W 
PCYAEKIDGARVMAGKVEMTLEILNEKEADERPAGKGRDEPNMNP 
KLDLPNRPETSFLWFTNPCKTMKFIVWRRFKWVIIGLLFLIjIIiI* 
LFVAVLLYSLPNYIiSMKIVKPKV 


6022 
6023 


4953 


549 


EAIQFE VS IGNYGNKFDTTCKPJLAS TTQYSRAVFDGNYYYYLPW ' ' 
AHTKPVVTLTSYWEDISHRLDAVNTL1AMAERLQTNIEALKSGI 
QGKIPANQLAELWLKL1DEVIEDTRYTLPLTEGKANVTVLDTQI 
RKLRSRSLSQIHEAAVRMRSEATDVKSTLAEIEDWLDKLMQLTE 
EPQNSMPDI 1 1 WMI RGEKRI4AYAR I PAHQ VL YS TS GENASGKY C 
GKTQTI FLKYPQEKNNGP KVPVELRVNI WLGLSAVEKKFNS FAE 
GTFTVFAEMYENQALMFGKWGTSGLVGRHKFSDVTGKIKLKREF 
FLPPKGWEWEGEWIVDPERSLLTEADAGHTEFTDEVYQNESRYP 
GGD WKPAEDTYTDANGD KAAS PS ELTC P PG WE WEDDAWS YD I NR 
AVDE KGWE YG I T I P PDHKPKS WVAAEKMYHTHRRR RLVRKRKKD 
bTQTAS STAGAMEELQDQEG WE YAS L I G WKFH WKQRS S DTFRRR 
RWRRKMAPSETHGAAAI FKLEGALGADTTEDGDEKSLEKQKHSA 
TTVFGANTPIVSCNFDRDYIYHLRCYVYQARNL1ALDKDSFSDP 
YAHI CFLHRSKTTE I IKS TLNPTWDQT 1 1 FD EVE I YG E PQT VIjQ 
KPPKVIMEDFDNDQVGKDEFLGRSIFSPWKLNSEMDITPKLLW 
HPVMNGDKACGDVLVTAEIiILRGKDGSNLPILPPQRAPNLYMVP 
QGI R P WQIjTAI E I LAWGIjRNMKNFQMAS I TS PS LWECGGER V 
ESWIXWLKKTPNFPSSVLFMXVFLPKEELYMPPLVIKVIDHRQ 

kgrkpwgqctierldrfrcdpyagkedivpqlkasllsappcr 

DIVIEMEDTKPLIASKCLSSMSTALSKMASPATVHLTEKEEEIV 
DWWSKFYASSGEHEKCGQYIQKGYSKLKIYNCELENVAEFEGLT 

dfsdtfklyrgksdenedpswgefkqsfriyplpddosvpapp 

RQFRELP DS VPQE CTVR 1 Y I VRGDELQ PQDNNGLCDP YI K I TLG 
KKVIE\DRDHYIPNTLNPVFGRMYEL»SC"YT,PnFTmT vTcwnvn 
TFTRDEKVGETIIDLENPF\LSRFG\SHCG\IPEEYCVSGVNTW 
RDS LR \ PTQ \I1I1QNVARFKG FPQ P ILSEDGSRIR YGGRD YSLDE 

feankilhqhlgapeerlalhilrtqglvpehvetrtlhstfqp 

N I S \R Y YLR VI I WNTKDVILDEKS ITGEEMSDI YVKG W I PGNEE 

nkqktdvhyrsldgegnfnwrfvfpfdylpaeqlcivakkehfw 
sidqtefripprXliiqiwXdndkfsXlddylgfprtltcrhti 
hflqks pggnc/rgldmi pdlkamnplkaktas lfeqksmkgww 
pcyaekdgarwagkvemtleii^keaderpagkgrdepnmnp 

KLDJjPNRPETSFLWFTNPCKTMKFIVWRRPKWVIIGLLFLLILL 

lfvavllyslpnylsmkivkpnv 


6024 


102 


916 



S^Ki^MFVEIi^I/LNTTPDRAEQGKIjTLLC'DAXTDGS FL VHHFL 
S F YLKANCKVCFVAIi I QS FSHYS I VGQKLGVSLTMARERGQLVF 
LEGL/IVCSGR\VFQAQKEPHPLQFLREANAGNLKPLFEFVREA 
LKPVDSGEARWTYPVLbVDDLSVLLSLGMGAVAVIiDFIHYCRAT 
VCWELKGNMVVLVHDSGDAEDEENDILIiNGLSHQSHLlLRAEGIi 
VTGFCRDVHGQLR I LWRRPSQPAVHRDQSFTYQ YKIQDKS VS FF 
1UCGMSPAVL 




3 


3260 


?I*S FLCYPRFKCUj f CLQFAI PASRMEQLNE lellme ks FWE EAE 
j PAEIjFQ KKVVASFPRT vlstgmdnr ylvi»avntvqnke gnce k 
ILVITASQSLENKELCILRNDWCSVPVEPGDriHLEGDCTSDTW 
C I DKD FG YIj I L YPDML I SGTS IAS S IRCMRRAVL5 ETFR S S DPA 

^Q^IGTVLHEVFQKAINNSFAPEKLQELAFQTIQEIRHLKEM 
SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 

amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








YRLNLSQDE I KQEVEDYLPS fckwagdfmhkntstdfpqmqlsl 
PSDNS KDNS TCNIE WXPMD I EES I WS PRFGLKGK.I DVTVGVKI 
HRGYICTKYK I MPLELKTGKESNS I EHRSQWLYTLLSQERRADP 

eaglllylktgqmypvpanhldkrellklrkqmafslphrisks 

ATRQ KTQLASL PQ 1 1 E EEKTCKYCS QIGNCAL YSRAVEQQMD CS 
S VP I VMLP K I E EETQHLKQTHLE YPS LWCLMLTLE S QS KDN KKN 
HQNIWLMPASEMEKSGSCIGNLIRMEHVKIVCDGQYLHNFQCKH 
GAIPVTNLMAGDRVIVSGEERSLFALSRGYVKEINMTTVTCLLD 
RNLSVLPESTLFRLDQEEKNCDIDTPLGNLSKLMENTFVSKKLR 
DLI IDFREPQFISYLSSVLPHDAKDTVACILKGLNTCDm?r>AMirv 
VLL S KD YTL I VGMPGTG KTTT I CTL VRI L YACGFS VLLTS YTHS 
AVDNILLKLAKFKIGFLRSR\QIQKVHPAIQQFTEHEICRSKSI 
KS\LALLEELYTSQL1DATTCMGINHPIFSRKIFDFCIVDEASQ 
I S Q P I CLGPL FFS RRF VLVGDHQQLP PLVLNREARALGMS ESLF 
KRLEQ^SAWQLTVQYRMNSKIMSLSNKIiTYEGKLECGSDKVA 
NAVINLRHFKDVKLELEFYADYSDNPWIJ«GVFEPNNPVCFLNTD 
j K VPAPEQ VEKGG VSN VTEAKLI VFLTS I FVKAGC S PS D I G 1 1 AP 
YRQQLKI INDULARS IGMVE VNT VDKYQD\RDKS I VLVS F VRSN 
KDGTVGELLKDWRRT iNVAI TRAKHKL I LLGCVPSLNCYPPLEKL 
LNHLNSEKLI IDIiPSREHESLCHILGDFQRE 


6025 
6026 


3977 


89 

S14 1 


^FPAQSDHLPPVFPLRSDLLITMSTLWSPHPDAFPSLRALIA 
ARYGEAGEGPGWGGAHPRICLQPPPTSRTSFPPPRLPALEQGPG 

GLWVWGATAVAQLEjWPAG lggpggs raavlvqq wvs yadteli p 

AACGATLPALGLRSS AQDP QAVLGALGRALS PLE E WLRLHT YLA 
GBAPTIJ^IJ^VTAIjLLPFRY^DPPARRIWNNVTRWFVTCVRQ 

pefravlgewlysgarplshqpgpeapalpktaaqlkkeakkr 

EKLBKFQQKQKIQQQQPPPGEKKPKPEXREKRDPGVITYDLPTP 
PGEKKDVSGPMPDSYSPRYVEAAWYPWWEQQGFFKPEYGRPNVS 
AANPRG VFMMC I P P PNVTGS LHLGHALTNA IQDS LTR WHRMRGE 
TrLWNPGCDHAGIATQVWEKKLWREQGIiSRHQLGREAFLQEVW 
KMKEEKGDRIYHQLFCKLGSSLDWDRACFTMDPKI^AAVTEAFVR 
LHEEG 1 I YRSTRLVNWS CTLNSAISD I E VDKKELTGRTLLS V 3 G 
YKEKVE FG VLVS FAYKVQGS DS DEE WVATTR I E TM LGDVAVAV 
HPKDTRYQHLKGKNV1HPFLSRSLPIVFDEFVDMDFGTCAVKIT 
PAHDQNDYEVGQRHGLEAISIMDSRGALINVPPPFLGLPRFEAR 1 
KAVLVALKERGLFRG I EDNPM WPLCNRS KD WE PLLRPQWYVR 
CGEMAQAASAAVTRGDLRI I» PERHQRTWHAWMDN I RE \ WCMFPG 
KLWWG \HR\ I PAYFVTVSDPAVPPGEDPDGRYWVSGRNEAEARE 
KAAKEFGVSPDKISLQQDEDVLDTWFSSGLFPLSILGWPNQSED 
LSVFYPGTLLETGHDII^FWVARMVMLGLKLTCRLPFREVYLHA 
IVRDAHGRKMSKSLGNVIDPLDVIYGISLQGLHNQLLNSNLDPS 
EVEKAKEGQKADFPAGIPECGTDALRFGIiCAYMSQGRDINLDVN 
R I LGYRH FCNKLWNATKFALRGLGKGF VPS PTSQPGGHESL VDR 
WI RSRL TEAVRLSNQG FQA YD FPAVTTAQ YS FWL YELCD VYLEC 
LKPVLNGVDQVAAECARQTLYTCLDVGLRLLSPFMPFVTEELFQ 
RLPRRMPQAPPSLCVTPYPEPSECSWKDPEAEAALEIAIiSlTRA 
VRP \ LRAD YNLHPES G PTCFLE VAD\ EATGALASAVSG YVQG PG 
QAQVWAVAEPWGLPAP\QGCAVAIASDRCSI\HLQLQG\LLDP 
ARELG\KLQ\AKRVEAQ\ROAQ\RLR\ERRA\ASGNPVKVPL\E 
VQEADEAKLQQTEAEL.RKVDEAIALFQKML 




2674 


: 

3 
£ 


GPITFLKKKAKMKDMPLRIHVLLGLAITTLVQAVDKKVDCPRLC " 

XCEIRPWFTPRSIYMEASTVDCNDLGLIiTFPARIiPANTQILIiIiQ 

HflNIAKIE YS TDFP VNLTGLDLSQNNLS S VTMINGKKMPQLL S V 

ifLEENKLTELPEKCLSELSNLQELYINHNLLSTISPGAFIGIjHW 

jLRLHIiNSNRLQMINSKWFDALPNLEILMIGENPIIRIKDMNFK 

PIjINLiRSLVIAGINLTEI PDNALVGLENLES I S F YDNRL I KVPH 

/7^QK\A^NLKFIjDLNKWPINRIRRGDFSNMLHLKELGINNMPEL 

:SIDSLAVDNLPDLRKIBATNNPRLSYIHPNAPFRLPKLESI>IL 

JSNALSALYHGTIESLPNLKEISIHSNPIRCDCVIRWMNP'INKTN 

:rfmepdslfcvdppefc^qnvrqvhfrdmmeiclplxapesfp 

JNLNVEAGS YVS FHCRATA\EPQPE I YW I TPSGQKLLPNT \ LTD 
Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 



SEQ 
ID 
NO: 



Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 



6027 



5254 



4148 



6028 



120 



Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 

KFYVHSEQTLDINGVTPKE GGLYTClATNl,VGAi)LKSVMIKVDG 
SFPQDNNGSLNIKIRDIQANSVLVSWKASSKILKSSVKWTAPVK 
TENSHAAQSARIPSDVKVYNLTHLNPSTEYKICIDIPTIYQICNR 
KKCVNVTTKGLHPDQKEYEKNNTTTLMACIiGGLLGI IGVICLIS 

CLSPEMNCDGGHSYVRNYLQKPTFALGELYPPLINLWEAGKEKS 
TSLKVKATVIGIiPTNMS 

GGRRAPGR PGRS I KDEEEETVFRE WS FS PDPDPVRYYDKDTTK 
PISFYLSSLEEIiLAWKPRLEIDGFNVALEPIiACRQpPLSSQRPRT 
LLCHDMMGGYLDDRFIQGSWQTPYAFYHWQCIDVFVYFSHHTV 
TIPPVGWTNTAHRHGVCVLGTFITEWNEGGRLCEAFtiAGDERSY 
QAVADR L VQ I T \ R F FRFDG WL I NI ENS 1»S LAAVGNM PPFLR YLT 
TOLHROVPGRr.UT.wvncTn/ricnrvr vr.:An»T 



3432 



3533 



««uc«i>ii>ovj**iaiiic*CHJj v X Vli VJJ VFARGNWGGRFDT 

DKVGGGFRPRASGPVPPLGPHFLMDLPFPSAPQRNDSSCSSQSG 
DPVALRNRCPAPAKLCPH U 

^UliL»bgAKGFHGEIEDLQQWLTDT BRHLLASKPLUGLPETAKEQ 
LNVHMEVCAAFEAKEETYKSIjMQKGQQMLARCPKSAETNIDQDI 
NNLKEKWESVETKLNER\KT\KLEEALNIA\MEFHNSL\QDFIN 
WLTQAEQTIiNVASRPSLILDTVLFQIDEHKVFANEVNSHREQI I 
ELDKTGTHLKYFSQKQDWLIKNLLISVQSRWEKWQRLVERGR 
SLDDARKRAKQFHBAWS KLMEWLEESEKSLDS EU3 IANDPDKI K 
TQIAQHKEFQKS LGAKHS VYDTTNRTGRS LKEKTSLADDNLKLD 
DMLSELRDKWDTICGKSVERQNKLEEA\LIiFSGQFTDALQALID 
WLYRVEPQIiAEDQP VHGDI DLVMNLI DNHKAFQKELGKRTSSVQ 
ALKRSARBLIEGSRDDSS WVKVQMQELSTRWETVCALS I SKQTR 
LEAAI^QAEEFHSVVHAIiLEWbAEAEQTLRFHGVLPDDEDALRT 
h I DQH KE FMKKLE EKRAE LNKATTMGDTVLAI CHPDS I TT I KH W 
3 T 1 1 RARFEE VIjAWAKQHQQRIiAS ALAGL I AKQELLEALLAWLO 
WAETTLTDKDKEVIPQEIEEVKAI.IAEHQTFMEEMTRKQPDVDK 
VTKTYKRRAADPSSLQSHIPVLDKGRAGRKRFPASSLYPSGSQT 
QIETKNPRVNLLVSKWQQVWLLALERRRKLNDALDRLEELREFA 
NFDFDIWRKKYMRWMNHKKSRVMDFFRRIDKDQDGKITRQEFID 
GILSSKFPTSRLEMSAVADIFDRDGDGYIDYYEFVAALHPmCDA 
YKPITDADKIEDEVTRQVAKCKCAKRFQVEQIGDNKYRFFLGNQ 
FGDSQQLRLVRILRSTVMVRVGGGWMALDEFLVKNDPCRAKGRT 
NKELREKFIIiADGASQGMAAFRPRGRRSRPS SRGAS PNRSTS VS 
SQAAQAAS PQVPATTTPK 1 LH PLTRN YGKP WLTNS KMST P CKAA 
ECSDFPVPSAEGTPIQGSKLRiiPGYLSGKGFHSGEDSGLITTAA 
ARVRTQ FADS KKTPS R PG S RAGS KAGSRASSRRGS DASDFD I S E 

IQSVCSDVETVPQTHRPTPRAGSRPSTAECPSKIPTPQRKSPASK 
LDKSSKR 

| IMPCGSSRLLHGCWTHPNEPVSDLSYFDCIKSVMENSICVI.GESM" 
AGISQNAKTGDLPAFGECVGIASKALCGLTEAAAQAAYLVGIFD 
PNSQAGHQGL VDP I QFARANQAI QMACQNLVDPGS S PS QVLSAA 
T I VAKHTS ALCNACRI AS S KTANPVAKRHF VQSAKE VANSTANL 
| VKTIKALDGDFSEDNRNKCR I ATAPL IEAVENLTAFASNPEFVS 
IPAQISSEGSQAQBPILVSAXPMLESSSYL1RTARSLAINPKDP 
PTWSVIAGHSHTVSDSIKSLITSIRDKAPGQRECDYSIDGINRC 
IRDIEQASLAAVSQSIATRDDISVEALQEQLTSWQEIGHLIDP 
I ATAARG EAAQLGHKGTQLAS YFEPL I LAAVG VAS K ILDHOOOM 
TVLDQTKTLAfiSALQML YAAKEGGGN PKAQHTHDAI TEAAQIiMK 
EAVDDI M VTLNEAAS E VGLVGGMVDA I AEAMSKLDEGTPPEPKG 
TFVDYQTTWKYSKAIAVTAQEMMTKSVTNPEELGG1ASQMTSD 
, YGHLAFQGQMAAATAE PEE I GFQ I RTR VQDLGHG C I FLVQKAG\ 
I ALQVCPTDS YTKRBLI ECARAVTEKVS L VLSALQAGNKGTQACI 
TAATAVSGIIADLDTTIMFATAGTLNABNSETFADHRENILKTA 
KALVEDTKLL VSGAAS TPDKLAQAAQSS AATI TQLAE WKLGAA 

SLGSDDPETQWLINAIKDVAKALSDLISATKGAASKPVDDPSM 
YQLKGAAKVMVTNVTSLLKTVKAVEDEATRGTRALEATIECIKQ 
ELTVFQSKDVPEKTSS PEES 3 RMTKGITMATAKAVAAGNS CRQB 
P VIATANLS RKAVSDMIiTACKQAS FHPDVSDEVRTRAIiRFGTEC 
SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








TLGYLDLLEHVLVILQKPTPELKQQLAAFSKRVAGAVTELIQAA 
EAMKGTEWVDPEDPTVI AETE LLGAAAS I EAAAKKDEQLKPRAK 
PKQADE TLDFEEQI LEAAKS I AAATS AIi VKSASAAQRELVAQG K I 
VGS I PANAADDGQWSQGLI SAARMVAAATSSLCEAANASVQGHA 
S EE KLI S S AKQ VAAS TAQLLVAC KVKADQDSE AMRRLQAAGNAV 
KRASDNLVRAAQKAAFGXADDDDVWKTKFVGGIAQIIAAQEBM 
LKK3RELE EARKKLAQI RQQQYKFLPTELREDEG | 


6030 


3 


1777 


FPGRGSPALQLEVLI CLGLMGLERALNVIiAPI FYRNI VNLLTEN 
APWNSLAWTVTSYVFLKFIjQGGGTGSTGFVSNLRTFLWIRVQQF 
TS R RVELL I FS HLHELS LRWHI*GRRTGE VLR I ADRGT SS VTGL 1» 
SYLVF^IPTLADIIIGIIYFSMFFWAWFf:T/rVtfrY'*MCT vr t*tt 1 

IWTEWRTKFRRAMNTQENATRARAVDSLLNFETVKYYNAESYE 
VERYREAI IKYQGLEWKSSASLVLLNQTQNLVIGLGLLAGSLLC 
AYFVTEQKLQVGDYVLFGTYIIQLYMPLNWFGTYYRMIQTNFID 
MENMFDLLKK\STEVKDLPGAGPFRFQKGRIEFENVHFSYADGR 
BTLQDVSFTVMPGQTLAIiVGPSGAGKSTILRLIiFRFYDISSGCI 
R I DGQD I SQVTQAL FRFSHWE IiC PKDT VL FNDT I AON I RYGRVT 
AGNDEVEAAAQAAGIHDAIMAFPEGYRTQVGERGLKLSGGEKQR 
VAIARTILKAPGIILLDEATSAIoDTSNERAIQASLAKVCANRTT 
I WAHRLSTWNADQIIiVI KDGC I VERGRHEALLS RGGVYADMW 
QLQQGQEETS EDTKPQTME R 


6031 


160 


1694 


LRMSENLDKSNVNEAGKSKSlSIDSEEGIiEDAVEGADEALQKAIKS | 

UL/u^ruiw \ld irnoojfJrrtr v l VcuStlMJSt InKwV JTWMAIjAHEI WNG 1 

DFQI KP VELPENSLKKRVKE I VHKAFWDCLS VQLS EDPPAYDHA 

IKLVGEIKBTLLSFLLPGHTRLRNQITEVLDLDIiIKQEAENGAL 

DIS KLAEFI IGMMGTLCAPARDEE VKKLKD I KE IVPLFRE IKS V J 

LDLMKVDMANFAISSIRPHLMQQSVEYERKKFQEILERQPWSLD 

FVTQWLEEASEDLMTQKYKHAIjPVGGMAAGSGDMPRCSPVAVQN 

YAYLKLLKWDHLQRPFPETVLMDQSRFHELQLQ\REQLTILGAV 

LLVTFSMAAPGISSQADFAEKLKMIVKILLTDMHLPSFHLKDVL 

TTI GEKVCLEVSS CLSLCGS S P FTTDKETVUCGQ I QAVAS PDDP 

IRRIMESRIIiTFLETYItASGHQKPl/PTVPGGLSPVQRELBEVAI 

KFARLVNYNKMVFCPYYDAILSKILVRS | 


6032 


39 


2415 


AARLCRAQPTKSAWMIRDLSKMYPQTRHPAPHQPAQPFKFTISE"' | 
SCDRIKEEFQFLQAQYHSIiKbECEKLASEKTEMQRHYVMYYEMS 
YGLN I EM HKQAE I VKRLNAI CAQVI p FLS QEHQQQ WQAVERAK 
QVTMAELNA 1 1 GQQQLQAQHhS HGHGLP V PLTPHPS GLQP PA I P 
PIGS S AGLLALS S ALGG QS HI*P I KDEKKHHDNDIIQRDRDS I KSS 
SVSPSASFRGAEKHRKSADYSSESKKQKTEEKEIAARYDSDGEK 
SDDNLWDVSNEDPSSPRGSPAHSPRENGLDKTRLLKKDAPISP I 
ASIASSSSTPSSKSKELSLNEKSTTPVSKSNTPTPRTDAPTPGS 
NSTPGLR PVPGKPPGVDPLASStjRTPMAVPCPYPTP FGI VPHAG 1 
MNGELTS PGAAYAGIiHN I S PQMSAAAAAAAAAAAYGRS P WGFD 
PHHHMRVPAI PPNLTGIPGGKPAYSFHVSADGQMQPVPFPPDAL 
IGPG I PRHARQ INTLNHGE WCAVTISNPTRHVYTGGKGCVKVW 
D ISH PGNKS PVS QLDCIiNRDN Y I RS CRLLPDGRTL I VGGEAS TL 
SIWDLAAPTPRIKAELTSSAPACYALAISPDSKVCFSCCSDGNI 
AVWDLHNQTLVRQFO<^TDGASCIDISNDGTKLWTGGLDNTVRS 
W \ DLREGRQLQQHD/ FFTS PVFS LG YCP \ TEE WLAVGMENSN\ V 
EVIiHVTKPDKYQIiHLHESCVIiSI*KFAHCGKWF\VSTGKDNLLNA 

W\RTPYG\ASIF\QSKESSS\VLSCDI\SVDDKYIVTGS\GDK\ 
RATVYEVIY | 


6033 


39 


2415 


AARIiCRAQPTKSAWMIRDI^KMYPO^nmPAPHQPAQPFKFTISEH 
SOTRIKEEFQFLQAQYHSLKIiECEKIiASEKTEMQRHYVMYYEMS 
YG LN I EMHKQAE I VKRLNA I CAQ V I P FL S QEHQQQ WQAVERAK 

QVTMAELNAI IGQQQLQAQHLSHGHGLPVPLTPHPSGLQPPAI P 
P I GSS AG LLALS SALGGQS HL P I KDE KKHHDNDHQRDRDS I KS S 
S VS PSAS FRGAEKHRNSADYS S E SKKQKTEE K3 1 AAR YDS DGE K 
5DDNLWDVSNEDPS S P RGSPAHS PRENGLDKTRLL KKDAP IS P 
^SIASSSSTPSSKSKELSL^IEKSTTPVSKSNTPTPRTDAPTPGS 
NISTPGLRPVPGKPPGVDPLASSLRTPMAVPCPYPTPFGIVPHAG 
SEQ 
ID 

NO: 



6034 



Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 



Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 



2683 



6035 



404 



6036 



1745 



6037 



2936 



1919 



6038 



1450 



426 



6039 



4073 



1000 



Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 
MNG BLTS PGAA Y AGLHN I S PQM S AAAA AAAAAAA YGRS P WG FD 
PHHIlMRVPAIPPNIiTGIPGGKPAYSFHVSADGQMQPVPFPPDAIi 
IGPG I P RHARQ I NTkNHGE WCAVTI SNP TRHVYTGG KGCVKVW 
DI SH PGNKS P VS QLDCLNRDN Y I RSCRL LPDGRTL I VGGE ASTL 
SIWDLAAPTPRIKAELTSSAPACYALAISPDSKVCFSCCSDGNr 
AVWDLHNQTLVRQFQGHTDGAS CIDISNDGTKLWTGGLDNTVRS 
W\DLREGRQLQQHD/ FFTS pvfslgycp \TEEWLAVGMENSN \ V 
EVLHVTKPDKYQLHLHESCVLSLKFAHCGIGVF\VSTGKDNLLNA 
W\RTPYG\ASIF\QSK3SSS\VLSCDI\SVDDKYIVTGS\GDK\ 



E SGRR RR LKKKKS P C PGTAGG PGET N PG PGACPRG PREB AAAAM 
EIAPQEAPPVPGADGDIEEAPAEAGSPSPASPPADGRLKAAAKR 
VT FPSDED I VSGAVE PKD PWRHAQNVTVDE VI GAYKQACQKLNC 
RQ I PKLLRQLQEFTDLGHRLDCLDLKGEKLDYKTCEALEEVFKR 
LQFKWDLEOTNLDEDGASALFDM I EY YESATHLN I SFNKH IGT 
RG WQAAAHMMRKTS CLQ YL\DARNT PLLDHSAP FVARALR IRS S 
IAVLHLENASIjSGRPLMLLATAIiKMNKNLREI,YIi\ADNKIjNGLQ 
DS AQLGNLL KFNCS LQ I LDLRNNH VLD SGIAY I CEGLKEQRKGL 

vtlNvlwnnqlthtgmaflgmti^phtqsletlnlghnpignegv 
rhlknglisnrsvlrlglastkltcegavavaefiaesprllrl 
dlreneiktgglmai^ialkvnhsllrijdldrepkkeavksfie 
tqkallaeiqngckrnlvlarereekeqppqlsasmpettatep 
qpddepaagvqngapspapspdsdsdsdsdgeeeeeeegerbet 
psgaidtrdtgssepqpppepprsgpplpnglkpefalauppep 

PPGPEVKGGSCGLEHELSCSKNEKELEEL IiLEASQESGQBTL 

SVTYT .nT TTiWV MT(7B t t»\ r<-> r -r 



S VTYLG 1 1 LH KNTGAL PAD P VQL I SQ TP T P ST KflQItLS FLGM VG 
YF YLWIPGFAILTKPLCKIjTKENIiADAIDP KSFSHS S FRS LKTA 
LENASTIiAI*PDSSQPF\SLHTAEVQGCVVEH,TQGLGPLPV 
IiPDVEKLGRRRGRKMDSVEKGAATSVS NPRGRPSRGRPPKLQRN 
SRGGQGRGVEKPPHIjAALIIiARGGSKGIPIiKNIKHIiAGVPLIGW 
VLRAALDSGAFQSVWVSTDHDEIENVAKQFGAQVHRRSSEVSKD 
S STSI*DA 1 1 E FliN YHNEVD I VGN I Q ATS PCJjHPTDLQKVAEMI R 
E EG YD S VFS WRRHQ FRWS E I QKGVREVTE PLNLNPAKRP RRQD 
WDGELYENGS FYFAKRHL I EMG YLQGG KMA Y YEMRAEHS VD IDV 
DIDWPIAEQRVLRYGYFGKEKLKEIKLLVCNIDGCLTNGHIYVS 
GDQKEIISYDVKDAIGISLLKKSGIEVRLISERACSKQTLSSLK 
LDCKMEVSVSDKLAWDEWRKEMGLCWKEVAYLGNEVSDEECLK 
RVG LS G APADACSTAQKAVG Y I CKCNGGRGA\ I REFAEH I C \ LL 
MEKGLINFMPKNRKIjAVNIGEKK 



WTSWWM5SVIiTIIjLFSLQGNKMLNYSA PSAGGYLLPRKPVGTPA ~ 

GGGFPRRHSVTLPSSKFRQNQLLSSLKGEPAPALSSRDSRFRDR 

SFSEGGERLLPTQKQPGGGQVNSSRYKT\ELCRPFEENGACKYG 

DKCQFAHGIHELRSLTRHPKYKTELCRTFHTIGFCPYGPRCHFI 

HNAEE RRALAGARDLS ADRPRLQHS FS FAGFPS AAATAAATGLL 

DSPTSITPPPILSADDLLGSPTLPDGTNNPF\AFSSQELASL^A 

psmolpgggspttflfrpmsesphmfdsppspqdslsdqegy£s 
ssssshsgsdsptldnsrrlpifsrlsisdd 



— _ ^„ ^ Jn , Ajur A c ji ma x z>uu 

ssalqefgtrjnhtfgvplph rrkqiiscnicolrfnsdsqaaai T 
ykgtfchakklkaleamknkqksvtakdsakttftsittntints 
sdktdgtagtpaisttttveirkssvmtteitskvekspttatg 
nsscpsteteebkakrll\ycslckvavnsasqleahnsgtkhk 
tmlearngsgtikafpragvkgkgpvnkgntglqnktfhceicd 

VHVNSETQLKQHISSRRHKDRAAGKPPKPKYSPYNKLQKTAHPL 
GVKLVFS KEPSKPLAPRlLPWPIiAAAAAAAAVAVSS PFSLRTAP 
AATLFQTS AIjPPAIiLRPAPGP IRTAHTP VLF APY 
LDEYEARLTLAKLDDFEEDNEDDDENRVWQESKAA KITEIilNKIi 

nfldeaekdlatvwsnpfddpdaaelnpfgdpdseepitetasp 

RKTEDSFYNNSYNPFKEVQTPQYIiNPFDEPEAFVTIKDSPPQST 
KKKNIRPVDMSKYLYADSSKTEEEBLDESNPFYEPKSTPPPNNL 
VNPVQBLETERRVKRKAPAPPVLS PKTGVLNENTVSAGKDLSTS 
SEQ 
ID 
NO: 



6040 



6041 



Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 



Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 



475 



1052 



6042 



1306 



Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 

PKPSPIPSyVLCRKPNA S^LIiVWCKEVrKNYRGVKITNFTTSW -" 
RNGIiS FCA I LHH FR PDL I D YKSLN PQD I KENNKKA YDGFAS IG I 
S RLL EPSDM VL LAI PDKLTVMT YL YQ IRAHFSGQELNWQ I EENT 
SSKSTYKVGNYETDTNSSVDQBKPYAELSDLKREPELQQPISGA 
VDFLSQDDSVPVNDSGVGESESEHQTPDDHLSPSTASPYCRRTK 
SDTEPQKSQQSSGRTSGSDDPGICSMTDSTQAQVLLGKKRLLKA 
ETLELSDI/YVSDKKKDMSPPFICEETDEQJCLQTIiDIGSNLEKEK 
LENS RSLECRSD PES P I KKTS LS PTS KLGYS YS RDLDliAKKKHA 

SLRQTESDPDADRTTLNHADHSSKIVQHRLLSRQEELKERARVI. 
LEQARRDAAL2CAGNKHNTNTAAPFCNRQLSDQQDEERRRQLRER 
ARQL I AEARS GGKMS EL PS YGERAAE KLKERS KASGDENDNI E I 
DTNE E I PEG FWGGGDELTNLENDL DTPE QNS KL VDLKL KKLLE 
VQPQVANSPSSAAQKAVTESSEQDMKSC3TEDLRTERLQKTTERF 
RKPWFSKDSTVRKTQLQSFSQYIENRPEMKRQRSIQEDTKKGN 
EEKAAI TETQRKP SEDE VLNKG FKDS \SQYVVGELAALENEQKQ 
IDTRAALVEKRLRYIJ^TGRNTEEEEAMMQEWFMLVNKKNALIR 
RKNQLSIiLEKEHDLERRYELLNRELRAMLAI EDWQKTEAQKRRE 
QLLLDEZ>VALVNKRDALVkDI J3AQEKQAEEBDEHLERTLEONKG 
KMAKKEEKCVLQ 

^x^rrAPSCAFPVQFRQPSVS GLSQITKSLYISNGVAANNKLM 
LSSNQ I TMVINVS VEWNTLYEDIQYMQVPVADSPNSRLCDFFD 
PIADHIHSVEMKC<3R\TLLHCAAGVSRSAALCLAYLMKYHAMSL 
LDAKTWTKS CRP I IRPNSGFWEQL IHYEFQLFGKNT VHMVSS PV 
■ GMIPDIYEKEVRLM IPL 

3886 TEKDEKTAHNLENVLIHF WERLSEICVAKISKPEADVESVLGVS - 
I WLLCVLQKPKGSLKSSKKKNGKVRPADEILESNKENEKCVSSEG 
EKIECWELTTEPS LTHNSSGLLSPLRKXPLEDLVCKLADIS INY 
VNERKSEQHLRFLSTLLDSFSSSRVFKMLLGDEKQSIVQAKPLE 
I AKL VQ KNPAVQFL YQKL IG WLNEDQRKD FGFL VP I L YS ALRCC 

DNDMERKKVLDDLTKVDLKWNSIJjKIIEKACPSSDKHALVTPWL 
KGDILGEKLVNLADCLCNEDLESRVSSESHFSERWTLLSLVLSQ 
HVKNDYIilGDVYVERI I VRItRETLtFKTKKLSE&BSSDSSVS FIC 
DVAYNYFSSAKGCLLMPSSEDLLLTLFQLCAQSKEKTHLPDFLI 
CKLKNTWLSGVNLLVHQTDSSYKESTFLHLSALWLICNQVQASSL 
DXNSLQVLLSAVDDLLNTLLESEDSYLMGVYIGSVMPNDSEWEK 
MRQSLPMQWLHRPLLEGRLSLNYECFKTDFKEQDIKTLPSKLCT 
S ALLSKM VL I ALRKE T VLENNEL EKI IAELL YS LQWCEELDNP P 
I FL I GFCE I LQKMN I T YDNLRVLGNMSGLLQLLFNR SREHGTLW 
SLIIAKLILSRSISSDEVKPHYKRKESFFPLTEGNLHTIQSLCP 
FL S KEEK KEFS AQC I P ALLG W TKKDLCS TNGG FGHLA I FNS CLQ 

TKSIDDGELLHGILKIIISWKKEHEDIFLFSCNLSEASPEVLGV 
N I E 1 1 RFLSLFLKYCS S PLAESE WD FIMCS MLAWLE TTS ENQAL 
YS I PL VQL FAC VS CDLACDLS AFFDS TTLDTI GNLP VNL I SE WK 
E FFS QG I HSLLL P I LVT VTGENKDVS ETS FQNAMLKPMCSTL TY 
ISKEQLLSHKLPARLVADQKTNLPEYLQTLLNTLAPLLLFRARP 
VQIAVYHMLYKLMPELPQYDQDNLKSYGDEEEEPALSPPAALMS 
LLSIQEDLLENVLGCIPVGQIVTIKPLSEDFCYVLGYLLTWKLI 
LTFFKAASSQLRALYSMYLRKTKSLNKLLYHLFRLMPENPTYAE 
TAVEVPNKDPKTFFTEELQLSIRETTMLPYHIPHLACSVYttMTL 
KDLPAMVRLWWNS S E KRVFNI VDRFTS KY VS S VLS FOE I SS VOT 

STQLFNGMTVKARATTREVMATYTIEDIVIELIIQDPSNYPLGS 
IIVESGKRVGVAVQQWRNWMLQLSTYLTHQNGSIMEGLALWKNN 
VDKRFEGVEDCMICFSVIHGFNYSLPKKACRTCKKKFHSAVCLY 
KWFTSSNKSTCSLCRETFF X 

MAEIAPASPSDlKASVSN GDTTLLCSRRUSOGMNEVRQVSLTYP 
GSPAPSHSLPLQPRSGGSLCPSRAW/PDPHQLFDDTSSAQSRGY 
GAQ RAPGGLS Y PAAS PTPHAAFLADP VS NMAMAYGS SLAAQGKE 
L VDKN I DRFI P I TKLKY YFAVDTM YVGRKLGLL FFP YLHQDWE V 
QYQQDTPVAPRFDVNAPDLYIPAMAFITYVLVAGLALGTQDRFS 
PDLLGLQASSAliAWLTLEVLAlLLSLYLVTVNTDLTTIDLVAFIj 
G YKYVGM I GGVLMGLLFGKI G YYLVLGWCCVA I FVFMIRTLRLK 
SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 


6043 


403 


599 


j. LADAAAEG VP VRGARNQLRM YLTMAVAAAQ PMLM YWLTFHLVR " 

LCLFFFFPCATPVLPLPSLISAl,/CLSHt>SVSSU/FCPCQPPLPC 
PLP PLQNKTAKGS LSTEQS ERG 


6044 


793 


4X2 


KLEMWNFTl.ISKVKISREVTMIASKFGIGQQVRHSLLGYtiGVVV 
DIDPVYSLSEPS PDELAVNDELRAAP W YHWMEDDNGLP VHTYL 
AEAQLSSELQDBHP\EQPSMDELAQTIRKQLQAPRLRN 


6045 


155 


2299 


SPLPQVAAMN YLRRRLS DSNFI4ANLPNGYMTDLQRPQ PP PPPPG 
AHSPGATPGPGTATAERSSGVAPAASPAAPSPGSSGGGGFFSSL 
SNAVKQTTAAAAATFS EQ VGGGSGGAGRGGAASRVLLVI DEPHT 
DWAKYFKGKKIHGEIDI ICVEQAEFS DLNLVAHANGG FS VDMBVh 
RNG VKWRS LXPDFVL I RQHAFSMARNGDYRSLVIGLQYAGI ps 
VNSLHSVYNFCDKPWVFAQMVRLHKKLGTEEFPLIDQTFYPNHK 
EMLS S \TTYP VWKMGHGTLWGWGKVKVDNQHDFQDIAS WALT 
KTYATAEP FI DAKYDVRVQKI GQN YKAYMRTS VSGNWKTNTGSA 
MLEQIAMSDRYKLWVDTCSEI FGGLDICAVEALHGKDGRDHI IE 
WGSSMPLIGDHQDEDKQLIVELWNKMAQALPRQRORDASPGR 
GSHGQTPS PGALPLGRQTSQQPAGPPAQQRPPPQGGPPQPGPG P 
QROG P PLQQR P P POGQQHLSGLG P PAGS PLPQRLPS PTS APQQ P 
ASQAAPPTQGQGRQSRPVAGGPGAPPAARPPASPSPQRQAGPPQ 
ATRQTSVSGPAPPKASGAPPGGQQRQGPPQKPPGPAGPTRQASQ 
AGPVPRTG PPTTQQPRPSGPGPAGRPKPQLAQKPSQDVP P PATA 

AAGGPPHPQLNKSQSLTNAFNLPEPAPPRPSLSQDEVKAETIRS 
LRKSFASLFSD 


6046 


212 


1075 


EGLTGPCERVPFLLGRGPPHGATRAGHRRAVRWAGPESLPPLPR 
SLIMDSPRAGTHQGPLDAETEVGADRCTSTAYQ2QRPQVEQVGK 
QAPLS PGLPAMGGPGPGP CEDPAGAGGAGAGGS E PLVT VT VQCA 
PTVALRARRGADLSSLRALLGQALPHQNAQLGQLSYLAPGEDGH 
WVPIPESESLQRAWQDAAACPRGIiQLQCRGAGGRPVLYQVVAQH 
SYSAQG PEDLG FROGDT VDVLCE VDQAWLEGHCDGR IG I FP KCF 
WPAG PRMSGAPGRLPRSQQGDQP 


6047 


49 


1405 


P VLVTS LRMRE ADTLRP PQLME VS AD 1 1 ST VE FNHTG ELLATGD 
KGGRWIFQREPESKNAPHSQGEYDVYSTFQSHEPEFDYLKSLE 
IEEKINKIKWLPQQNAAHSLLSTNDKTIKLWKITERDKRPEGYN 
LKDBEGKLKDLSTVTS LQVP VLKPMDLMVEVS PRR I FANGHTYH 
INS I S VNS DCET YMSADDLRI NLWHLA I TDRS FTP \ NI VD I KPA 
NMEDLTE VI TAS E FHPHHCNLFVYS SS KGSLRLCDM RAAALCDK 
HSKLFEEPEDPSNRSFFSE I IS\SVSDVKFSHSDRYMLTR\DYL 
TVKVWDL \ NME AR P IETYQVHDy LRS KLCSLY END CI FDKFE CA 
WNGSDS V I MTGA\ YNNFFRM FDRNTKRD VTL \EAS RES S KPRAV 

LKPRRVCVGGKRRRDDISVDSLDFTKKILHTAWHPAENI1AIAA 
TNNLYI FQDKVNSDMH 


6048 


1 


3194 

i 
I 


GIRTPKFCDSPTSDLEMRNGRGRGKRMRPNSNTPVNETATASDS 

KGTSNSS KTRAGANSKGRRGSQNSS E HR P PASS TS EDVKAS PS S 

ANKR KNK PLSDME LNSS S EDS KGS KR VRTNSMGS ATGPL PGTKV 

EPTVLDRNCPSPVLIDCPHPNCNKKYKHINGLKYHQAHAHTDDD 

SKPEADGDSEYGEEPILHADLGSCNG\ASVSQK\GSLSPARSAT 

PKVRLVEPHSPSPSSKFSTKGLCKKKLSGEGDTDLGALSNDGSD 

DGPSVMDETSNDAFDSLERKCMEKEKCKKPSSLKPEKIPSKSLK 

SARPI /APLAI PPQQI YTFQTATFTAAS PGSSSGLTATVAQAMP 

NSPQLKPIQPKPTVMGEPFTVNPALTPAKDKKKKDKKKKESSKE 

LESPLTPGKVCRAEEGKSPFRESSGNGMKMBGIiLNGSSDPHQSR 

LASIKAEADKIYSFTDNAPSPSIGGSSRLENTTPTQPLTPLHW 

TQNGAE AS S VKTNS PAYS DI SDAGEDGEGKVDSVKS KDAEQLVK 

EGAKKTLFPPQPQSKDSPYYQGFESYYSPSYAQSSPGALNPSSQ 

AGVESQALKTKRDEEPESIEGKVKNDICEEKKPELSSSSQQPSV 

IQQRPKMYMQSLYYNQYAYVPPYGYSDQSYHTHLLSTNTAYRQQ 

SrEEQQKRQSLEQQQRGVDKKAEMGLKEREAALKEEWKQKPSIPP 

rLTKAPSLTDLVKSGPGKAKEPGADPAKSVUPKLDDSSKLPGQ 

\P EGLKVKLSDASHLSKEAS EAKTGAE CGRQAEMD P I L WYRQEA 

SPRMWTYVYPAKYSDIKSEDERWKEERDRKLKEERSRSiCDSVPK 
SEQ 
ID 

NO: 



Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 



215 



6050 



6051 



6052 



566 



Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 



1089 



1718 



Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 
EDGKESTSSDCKLPTSEESRLGSKEPRPSVHVPVSSPLTQHQSY 

I P YMHO Yfl V<?nS VD DMHD O VD O MTt^m/rMAv...^ ! _ 



1718 



6053 



201 



6054 



1704 



losr 



„ — ^ * „ uwvl vuir i. «c.iic»jKjLjv*oivcfKfa VttVPVSSPIiTQHQS Y 

I P YMHG YS YSQS YDPNHPS YRSMPAVMMQNYPGS YhPSSYS FS P 
YGSKVSGGEDADKARASPSVTCKSSSESKALDILQQHASHYKSK 
SPTISDKTSQERDRGGCGWGGGGSCSSVGGASGGERSVDRPRT 
|PSQRLMSTHHHHHHIjGYSLLPAQYKLPYAAGI.SSTAIVASQQG 

AMTUN^FUKRVPS IRSGDFQAPFQTSA AMHHPSQESPTIjPESSAT 

dsdyysptggaphgycsptsasyg\kai,npyqyqyhgvngsags 
ypakayadysyassyhqyggaynrvpsatnqpekevtepevrmv 
ngkpkkvrkprtiyssfqlaalqrrfqktqylalperaelaasl 

GLTQTQVKIWFQNKRSKIKKIMKNGEMPPEHSPSSSDPMACNSP 
QSPAVWEPQGSSRSLSHHPHAHPPTSKQSPASSYLENSASWYTS 

AASS inshlpppgslqhplala sgtly 

AUi.iSKTCCAMEESDSlSKTTEKENLGPRMDPPLGEPGVGSLGWVL " 
PNTAMKKKVLLMGKSGSGKTSMRSIIFANYIARDTRRLGATILD 
RIHSLQINSSLSTYSLVDSVGNTKTFDVEHSHVRFLGNLVLNLW 
DCGGQDTFMEN YFTS QRDNI FRNVEVLI Y VFD VESRELEKDMH Y 
YQS CLEAI LQNSPDAK I FCLVHKMDLVQEDQRDLI FKEREEDLR 
RLSRPLECSCFRTSIWDETLYKAWSSIVYQLXPNVQQLEMNLRN 
FAEIIEADEVLLFERATFLVISHYQCKEQRDAHRFEKISNIIKQ 
FKLSCSKLAASFQSMEVRNSNFAAFIDIFTSNTYVMWMSDPSI 
PSAATLIKIRNARKHFEKLERVDG PKQCLLMR 

KQLERTCCAMEESDSEKTT EKENLGPRMDPPLGKPGVGSLGWVL " 
PNTAMKKXVLLMGKSGSGKTSMRS I IFANYIARDTRRLGATILD 
RIHSLQINSSLSTYSLVDSVGNTKTFOVEHSHVRFLGNLVLNLW 
DCGGQDTFMENYFTSQRDNIFRNVEVLIYVFDVESRELEKDMHY 
YQSCLSAILQNSPDAKIFCLVHKMDLVQEDQRDLI FKEREEDLR 
RLSRPLECSCFRTSIWDETLYKAWSS I VYQLI PNVQQLEMNLRN 
FAE 1 1 EADEVLL FERAT FLVI SHYQCKEQRDAHR FSKI SN 1 1 KQ 
FKLSCSKIJWVSFQSMEVRNSNFAAFIDIFTSNTYVMVVMSDPSI 
PSAATLINIRNARKHFEKLERVDGPKQCLLM R 

KGLERTCCAMEESDSEKTTEKENLGPRMDPPLGEPG\GSLGWVL 
PNTAMKXKVLLMGKSGSGKTSMRSI I FANYIARDTRRLGATILD 
RIHSLQINSSLSTYSLVDSVGNTKTFDVEHSHVRFLGNLVLNLW 
DCGGQDTFMENY FTSQRDN I FRNVEVL I YVFDVE SRELEKDMHY 
YQSCLEAILQNS PDAKI FCLVHKMDLVQEDQRDLI FKEREEDLR 
RLSRPLECSCFRTS I WDETLYKAWSS I VYQLIPNVQQLEMNLRN 
FAE 1 1 EADEVLL FERAT FLVI SH YQC KEQRDAHRFEKI S NI I KQ 
FKLS CS KLAAS FQSMEVRNSNFAAF I D I FTSW T YVMWMSDPS I 
PSAATLINIRKARKHFEKLERVDGPKQ CLLMR 
KGTEMNKSkWQSRRRHGRREHQQNPWF RLRDSBDRSDSRAAQPA 
HDSGHGDDES PS TSSGTAGTSSVPELPGFYFDPEKKRYFRLLPG 
HNNCNPLT KES I RQ KEMES KRLRLLQEEDRRKK I ARMG FNAS S M 

LRKSQLGFLNVTNYCHLAHELRLSCMERKKVQ IRS MDPS ALAS D 
RFNLILADTNSDRLFTVNDVTVGGSKYGIINLQSLKTPTLKVFM 
HEC^YFTNRKV\NSVCWASLNHLDSHILLCLMGLAETPGCATLL 
PASLFVNSHPAGIDRPG\MLCSFRIPGAWSCAWSLNIQANNCFS 
TG LSRR VLLTNWTGHRQS FG TNSDVLAQQ FALMAPLLFNGCRS 

GEIFAIDLRO^QGKGWKATRLFHDSAVTSVRILQDEQYLMASD 
MAGKI KLWDLRTTKC VRQY EGH VNE YAYL P LHVHE EEG I LVAVG 

QDCYTRIWSLHDARLLRTIPSPYPASKADIPSVAFSSRLGGSRG 
APGLLMAVGQDLYCYSYS 

PPIAJ^QEFGTSRRHMAAPSGVHLLVRRG SHklFSSPLNHlYLH " " 
KQSSSQQRRNFFFRRQRDISHSIVLPAAVSSAHPVPKHIKKPDY 
VTTGIVPDWGDSIEVKNEDQIQGLHQACQLARHVLLLAGKSLKV 
DMTTEE IDALVHRE 1 1 SHNAYPS PLGYGGFPKS VCTS VNNVLCH 

GIPDSRPLQDGDIINIDVTVYYKGYHGDTSETFLVGNVDECGKK 
LVE VARRCRDEAIAACRAGAP FS V I GNT I SHI THQNG FQ VCPHF 

VGHGIGSYFHGHPEIWHHANDSDLPMBEGMAFTIEPIITEGSPE 
FKVL EDAW TWS LD / TSKVSAQFEHT VL I TSRGAQ I LTKLPHEA 
SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 


6055 
6056 


421 


2364 


P P Y F LLS FLAW WLYGQS DRTETD I SQS AGP PPGTLQCS AIjHHDP~""~ 

GCANCSRPCRDCSPPACQCHTIIVPPGNALNGVQPPELSRTLALI 

SSREPPRKKKKSQTETGKERERTSFLTQGGKRFELQHGLAGICM 

TLLITGDS I VSAEAVWDHVTMANREI1AFKAGDVIKVI1DASNKDW 

WWGQ I DD E EG W FPAS F VRLWVNHEDEVE EGPS DVQNGFTLDPNSD 

CLCLGRPLQNRDQMRANVI NE I MSTERHY I KHLKDICEG YLKQC 

RKRRDMFSDEQLKVIFGNIEDIYRFQMGFVRDLEKQYNNDDPHL 

SEIGPCFLEHQDGFWIYSEYCNNHLDACMELSKUdKDSRYQHFF 

EACRLLQQM I D I A\ I DGFLLTPVQKI CKYPLQLAEI/LKYTAQDH 

SDYRYVAAALAVMRNVTQQINERKRRLENIDKIAQWQASVLDWE 

GBDILDRSSELIYTGEMAWIYQPXYGRNQQRVFFLFDHQMVLCK 

KDLIRRDILYYKGRIDMDKYEWDIEDGRDDDFNVSMKNAFKLH 

NKETEEIHLFFAKKLEEKIRWLRAFREERKMVQEDEKIGFEISE 

NQKRQAAMTVRKV PKQKGVNSARS VP PS YPPPQDPLKHGQ YLVP 

\DGIAQ3QVFEFTEPKRSQSPFWQNFSRLTPFKK 


6057 


43 


3358 


SGGRGPVRVRSEQLSPSAEQVSQISQISLGRRPLSSLPPPPSRA 
LAPTRAPDTALTIMEVAEVESPLNPSCKIMTFRPSMEEFREFNK 
YLAYMESKGAHRAGIiAKVIPPKEWKPRQCYDDIDNLLlPAPICX) 
MVTGQS GLFTQ YN I QKKAM TVKEFRQLANSG KYCTPR YI*D YE DL 
ERKYWIQ^TFVAPIYGADINGSIYDEGVDEWNIARLKPrvrjDVVE 
EE CG I S I EGVNTP YLYFGMWKTTFAWHTEDMDL YS INYLHFGEP 
KS W Y A I ?PEHGKRLERI*AQGF FPS S SQG CDAFLRHKMTL I S PS V 
LKKYG I ? FD KI TQE AGE FM I T FP YG YHAGFNHG FNCAES TNFAT 
VRWIDYGKVAKLCTCRKDMVKISMD I F VRKFQPDR YQLWKQGKD 
I YT I DHT KPTPAS T PEVKAW LQRRRKVRKASRS FQCARS TS KR P 
KADEEEEVSDEVDGAEVPNPDSVTDDLKVSEKSEAAVKLRNTEA 
SSEEESSASRMQVEQNIjSDHIKLSGNSCLSTSVTEDIKTEDDKA 
YAYRSVPSISSEADDSIPLSTGYEKPEKSDPSELSWPKSPESCS 
S VAESNG VIjTEGEESDVESHGNGLEPGEI PAVPSGERNS FKVPS 
IAEGENKTSKSWRHPLSRPPARSPMTLVKQQAPSDEELPEVLSI 
EEEVEETESWAKP1>IHLWQTKPPNFAAEQEYNATVARMKPHCAI 
CTLLMPYHKPDSSJNEENDARWETKLDEWTSEGKTKPLIPEMCF 
I YSE EN I EY S PPNAFLEEDGTS LLI S CAKCC VR VHAS CYG I PSH 

EICDGWLCARCKRNAWTAECCLCNLRGGALKQTKNNKWAHVMCA 
VAVPBVRFTNVPERTQIDVGRI PLQRLKLKCIFCRHRVKRVSGA 
CIQCSYGRCPASFHVTCAHAAGVL\MEPDDWPYWNITCFRHKV 
NPNVKS KACEKV I S VGQTV I TKHRNTR Y YS CRVMAVTSQTF YE V 

MFDDGSFSRDTFPEDIVSRDCLKLGPPAEGEWQVKWPDGKI.YG 
AKYFGSNI AHMYQ VE FEDGSQ I AM KRE D I YTLDEELP KRVKAR F 
VSAGRCHLGTCQVNS 1*S S PHVS QAQQET YLGFW INS KKSQCN1 F 
LSGTY 


6058 


1 


853 


fvarlkeqegegglgprkekgrargrerrrkmqltrccfvflvq 
gslylvicgqddgppgsedperddhegqprprvprkrghispks 

RPMANSTLLGLIiAPPGEAWGILGQPPNRPNHSPPPSAKVKKIFG 

wgdfysniktvalnllvtgkivdhgngtfsvhfqhnatgqgnis 

I S LVPPS KAVE FHQEQQ IFI EAKAS KI FNC \ RME WE KVE\RGRR 

TSLFTHDPAKICSRDHAQSSATWSCSQPFKWCVYIAFYSTDYR 
LVQKVCPDYNYHSDTPYYPSG 


6059 


1 


986 

1 

I 


HPl,PSASLGI,PSVSI^VSDCVRSALLEAWPMIiPKRRRARVGSP~- 

SGDAASSTPPSTRFPGVAIYLVEPRMGRSRRAFLTGLARSKGFR 

VLDACSSEATHWMEETSAEEAVSWQERRMAAAPPGCTPPALLD 
ISWLTESLGAGOPVPVECRWBT uunrncvrDT c?nj\MMTi»w7 l /»«* 
* ***** ^'•c* » rv rvn.K.Lje» v/«jir , ols.v»irJjdPAWMPAYACQR 

PTPLTHHNTGLSEALEIIiAEAAGFEGSEGRLLTFCRAASVriKAL 

PSPVTTLSQLQGLPHFGEHSSRWQELLEHGVCEEVERVRRSB/ 

^LFTQIFGVGVKTADRWYREGLRTLDDIiREQPQKLTQQQKAGEP 
5 REAGPWAS LNCTLDP SAS TP 




2 


3650 < 

c 

c 
£ 


3QDFSSIADLTDHRAHRCPGDGDDDPQLSWVASSPSSKDVASPT 
}MIGDGCDU3LGEEEGGTGItPYPCQFCDKSFIRLSYLKRHEQlH 
I D KL ? FKCTYCS RLFKHKRS RDRH I KLHTGDKKYHCHE CEAAFS 

ISDHLKIHLKTHSSSKPFKCTVCKRGFSSTSSLQSHMQAHKKNK 
JHIJVKSEKEAKKDDFMCDYCEDTFSQTEELEKHVIiTRHPQLSEK 








ADLQCIHCPEVFVDENTLLAHIHQAHAMOWi:<ePMCPE\QFSSV 
\ EGVYCHLDSHRQPDSSNHS VS PD PVLGS VASMSSATPDSSAS V 
BRGSTPDS TLKPI*RGQKKMRDDGQGWTKVVYS CP YCS KRD FNSL 
AVLE IHLKTIHADKPQQSHTCQ I CLDSMPTL YNLNEHVRKLHKN 
HAYPVMQ FGNI SAFHCNYCPEM FADINSLQEHIRVSHCGPNANP 
SDGNNAFFCNQCSKGFLTESSLTEHIQ\Q\AHCSVGSAKLESPV 
VQPTQSFMEV YS CPYCTNSP I FGS ILECLTKH I KENHKNI PLAHS 
KKS KAE Q S P VSS DVE VS S PKRQ R L S ASANS I SNGE YPCNQ CDLK 
FSNFES FQTHLKLHLELLLRKQ AC PQC KEDFDSQ E SLLQHLTVH 
YMTTS TH YVC ES CDKQFS S VDD \ LQKH \LLDMPHPLCCTH C T\ L 
C0EVFDS\KVSI \QVHLAVKHSNEKKMYRCTACNWDFRKEADLQ 
VHVKHS HLGN PAKAHKC I FCG E T FS TEVELQ CH I TTHS KKYNCK 
FCS KAFHAI I LLE KHLRE KHC VFDAATENGTANG VP PMAT KKAK 
PADLQGMLLKNPEAPNSHEASEDDVDASEPMYGCDICGAAYTME 
VLLQNHRLRDHN I RPGEDDGS RK KAE FI KG S HKCNVCS RT F FSE 
NGLREHLQTHRG PAKH YMCPICGERFPSLLTLTEHKVTHS KSLD 
TGTCR I CKM PLQS EE E F I E HCQMH PDLRNS LTGFRCWCMQTVT 
STLELKIHGTFHMQKLAGSSAASSPNGQGLQKLYKCALCLKEFR 
SKQDLVKLDVNGLPYGLCAGCMARSANGQVGGLAPPEPADRPCA 
GLRCPECSVKFESAEDLESHMQVDHRDLTPETSGPRKGTQTSPV 
PRKKTYQC1 KCQMTFENERE I Q IHVANHM I E EG I NHECKL CNQM 
FDS PAKLLCHLI EHS FEGMGGTFKC P VCFTV FVQ ANKLQQH I FA 
VHGQEDKI YDCSQCPQKF F FQTELQNHTMSQHAQ 


6060 


2145 


202 


SYEIVGKNKLEVNHSOtKALCKCSLPSRLLPLGENLPLLDRGFR 
KEPRSRGSRERDNMLHLHHSCLCFRSWLPAMLAVLLSLAPSASS 
DI SASRPNILLIiMADDLG I GD IGC YGNN2WRTPN I DRLAEDG VK 
LTQHISAASLCTPSRAAFBTGRYPVRSGMVSSIGYRVLQWTGAS 
GGLPTNETTFAK I LEE KG YATGL T a KUHT .rtT .wr it q aQnucu u Dr 

HHGFDHFYGMP FS LMGDCARWELS E KR VNLEQ KLNFLFQ VLAIAT 
ALTLVAG KLTHL I PVSWMP VI WSALSAVLLLAS SYFVGALIVHA 
DCFLMRNHTITEQPMCFQRTTPLILQEVASFLKRWKHGPFLLFV 
SFLHVH I PLITMENFLGKS LHGLYGDNVKEMDWMVGRI LDTLDV 
EGLSNSTLI YFTS DHGGSLENQLGNTQYGGWNG I YKGGKGMGGW 
EGGlRVPGIFRWPGVLPAGRVIGEPTSLMDVFPTWRIiAGSEVP 
QDRVIDGQDLLPLIiLGTAQHSDHEFLMHYCERFLHAARWHQRDR 
GTMWKVHFVTPVFQPEGAGACYGRKVCPCFGEKVVHHDPPLLFD 
LSRDPSETHILTPASEPVFYQVMER \ VQQAVWEHQRTLSPVPLQ 
LDRLGNI WRPWLQPCCGPF PbCWCIiREDDPQ 


6061 


110 


1330 


IWIHMKRKTIK1JINTFENRMLMLIX3MPAVRVKTELLESEQGSPN ~ 
VHNYPDMEAVPLLLNNVKGEPPEDSLS VDHFQTQTE PVDLS INK 
ARTSPTAVSSSPVSMTASASSPSSTSTSSSSSSRLASSPTVITS 
VSSASSSSTVLTPGPLVASASGVGGQQFLHIIHPVPPSSPMNLQ 
SNKI^HVHRIPVWQSVPVVYTAVRSPGNVNNTIVVPLLEDGRG 
HGKAQMDPRGLSPRQSKSDS DDDDLPNVTLDS VNETGS TALS IA 
RAVQEVHPS P VSRVRGNRMNNQKF PCS ISPFSIES TRRQRTVLN 
PPDSRKTAYSTDCDF\EG3CjQQKLYTKSSSPGRVHRRTHTGEKPY 

KCTWEGCTWKFARSDELTRHYRKHTGVKPFKCADCDRSFSRSDH 
LALHRRRHMLV 


6062 


71 


1079 


ETMAKNG PENUEDCH I LNAEAFKS KKICKSLKICGLVFG I LALT 
LIVLFWGSKHFWPEVPKKAYDMEHTFYSNGEKKKIYMBIDPVTR 
TE I FRSGNGTDETLEVHD FKNG YTG I YF VGLQKCF I KTQ X KV I P 
EFSEPEEEIDENEEITTTFFEQSVIWVPAEKPIENRDFLKNSKI 
LEICDNVTMYW\INPTL\ISGTFAKQLHHNFAFIILVSELQDFE 
EEGEDLHFPANEKKGIEQNEQWWPQVKVEKTRHARQASEEELP 
INDYTENG I E FDPMLDERG YCC I YCRRGNR YCRRVCE PLLGYYP 
YP YC YQGGRVI CRVIMPCNWWVARMLGRV 


6063 


71 


1079 


ETMAKNG PENCE DCH I IiNAEAFKSKKICKSLKICGLVFGI LALT " 
LI VLFWGSKHFWPEVPKKAYDMEHTF YSNGEKKKI YMEI DPVTR 
TE I FRSGNGTDETLBVHDFKNG YTGI YFVGLQKCFI KTQI KVIP 
EFSEPEEEIDENEEI TTTFFEQSVI WVPAEKP I ENRDFLKNSKI 
[*E ICDNVTMYW\ INPTL\ ISGTFAKQLHHNFAFI ILVSELQDFE 



913 






311 



1153 



6066 



68 



6067 



858 



6068 



13 



641 



3470 






E EGEDIiHFPANEKKG I EQN EQW WPQVKVKKTRHARQASEEEL P 
IKfD YTENG I EFDPMLDERG YCCI YCRRGNR YCRRVCE PLLG YYP 
YPYCYQGGRVICRVIMPCWWWVARMIjGRV 



NLPQSLPRPTHHSPPYSLEKMTDLV AVWDVALSDGVHKIEFEHG 
TTSGKRWYVDGKEEIRKEWMPKLVGKBTFYVGAAKTKATINID 
AISGPAYEYTLEINGKSLKKYMEDRSKTTNTWVLHMDGENFRIV 
LEKDAMDVWCNGKKLETAGEPVDDGTETHFS IGTH\ACYIKAV\ 
SSG\KRKEGI IHTLI VDNREI PEIAS 



321 



MSVRVARVAWVRGLGASY RKGASSFPVPPPGAQGVAELLRDATG 

AEEEAPWAATERRMPGQCSV^LPPGQCSSQWGMGRGLLNYPRVR 

E LYAAARR VLG YDLLELS LHGPQE T LDRT VHCQ PAI FVAS LAAV 

E KLHHLQ PS V I ENCVAAAG FS VGE FAAL VFAGA MB FAEG 

VKENMPATRKPMRYGHTEGHTEVCFDDSGSFIVTCGSDGDVRIW" 

BDLDDDDPKFINVGEKAYSCALKSGKLVTAVSNNTIQVHTFPEG 

VPDGI LTRFTTNANHWFNGDGTKIAAGS SD \ FI/VKI VDVMDSS 

QQKTFRGKDAPVLSLSFDPKDIFLASASCDGSVRVWQISDQTCA 

ISWPLLQKCNDVINAKSICRIiAWQPKSGKLIAIPVEKSVKLYRR 

ESWSHQFDLSDNFISQTLNIVTWSPCGQYLAAGSINGLIIVWNV 

E TKDCMERVKHE KG YA I CGLAWHPTCG R I S YTDAEGN LGI>I»ENV 

CD PSG XTS S S KVS SRVE KD YNDIiFDGDDMSNAGD FLNDNAVE I P 

SFSKGIINDDEDDEDLMMASGRPRQRSHILEDDENSVDISMLKT 

GSSLLKEEEEDGQEGSIHNLPLVTSQRPFYDGPMPTPRQKPFQS 

GSTPLHLTHRFMVWNS IGI IRCYNDEQDNAI DVEFHDTS IHHAT 

HLSNTLNYTIADLSHEAILLACESTDELASKLHCLHFSSWDSSK 
EWI IDLPQNBDIEAICIjGQGWAAAATSAIjIiLRLFTIGGVQKEVF 
SLAGPWSMAGHGEQLFIVYHRGTGFDGDQCLGVQLLELGKKKK 

qilhgdplpltrksylawigfsaegtpcyvdsbgivrmlnrglg 

NTWTPI CNTREHCKGKSDHYW WG I HENPQQLRCI PCKGSR FPP 
TLPRPAVAILS FKLPYCQ I ATEKGQMEEQFWRS VI FHNHLDYLA 
KNGYEYEESTKNQATKEQQELLMKMLALSCKLEREFRCVELADL 
MTQNAVNLAIKYASRSRKLIIiAQKLSELAVEKAAELTATQVEEE 
EEEEDFRKKLNAGYSNTATEWSQPRFRWQVEEnAEDSGEADDEE 
KPEIHKPGQNS FS KSTNS SDVSAKSGAVT FSSQGRVNPFKVSAS 
SKfiPAMSMNSARSTNILDNMGrCSSKKS'TALiSRTTNNEKSPI IKP 
L I PKP KPXQAS AAS YFQKRNS QTNKTEE VKEENL KNVLS ETPAI 
CPPQNTENQRPKTGFQMWLEENRSNILSDNPDFSDEADIIKEGM 
IRFRVLSTEERKVWANKAKGETASEGTEAKKRKRWDESDETEN 
QEEKAKENLNLSKKQKPLDFSTNQKLSAFAFKQE 



1730 



27 



LPWQRI^VliLSRGKMAVTGWI^ESI^TAQKTAIJLQDGRRKWn?!^ - 
PDGKEMAE E Y DEKTS E LL VRKWRVKSALGAMG QWQLE VGDPAPI* 
GAGNLGPELIKESNANPIFMRKDTKMSFQWRIRNLPYPKDVYSV 
SVDQKERCI I VRTTNKKYYKKFS 1 PDLDRHQLPLDDALLSFA\T 



PTAP 



GS KMADLANa EKPA I AP P VFVFQKDKG Q KS PAEQKNLS DSGEE P " 
RG EAEAPHHGTGH P ES AGEHALE P PAPAGAS AS TP P PP APEAQL 
PPFPREIiAGRSAGGSSPEGGEDSDREDGNYCPPVKRERTSSLTQ 
FPPSQSEERSSGFRLKPPTLIHGQAPSAGLPSQKPKEQQRSVLR 
PAVLQAPQPKALSQT VPSSGTNG VS LPADCTGAVPAAS PDTAAW 
RS P S EAADE VCALEEKE PQKN ES SNAS E E EACE KKDPATQQA FV 
FGQNLRDRVKLINESVDEADMENAGHPSADTPTATNYFLQYISS 
SLENSTNSADASSNKFVFGQNMSERVLSPPKI^NEVSSDANRENA 
AAESGS E S S S Q EATPE KES LAB SAAA YTKATAR KCLLE KVE VI T 

GEEAESNVLQMQCKL F V FDKTS QS WVERGRGLLRLNDMAS TDDG 
TLQSRLSDAGPRGSLR\LILNTKLWAQMQIDKASEK\SIRITAM 
DNEDQGVKVFLISASSKDTGQVYAALHHRILALRSRVEQEQEAK 
MPAPE PGAAP SNEEDDS DDDDVLAP S GATAAGAGDEGDGQTTGS 



PTRPGQAGSS5AMAAQRLGKRVI»S KLQ5PSRARGPGGS PGGLQK 
RHAR VT VK YDRRE LQRRLDVKKW I DGR LEE L YRGMEADMPDE IN 
I D ELLEIjES EEERSRKI QGLLKS CGKPVEDF I QELLAKLQGLHR 








Q \PGLRQPS PS P \DGQPSAPFQGPGARTAS PLTLLALFPG P PER 
RPALLCVLSCI 


6070 


478 


858 


I RVTVDGEFLH Y I FP LQ FbDS PEW/ RFTETHRGRHF \Q VTLTAE 
TDCRYVSWRRKKliYLL FAQHRY I SRLFSVLIGSD I ADKL YALND 
RVYIGKRYEYDI RLPNFYQMSTPEI RRSPLTQHFQNSRR YW 


6071 


2 


1654 


HEARTKGNMALARP\VRLFSLVTRLLLAPRRGLTVRSPDEPLPV 
VRI PVALQRQLEQRQSRRRNLPRPVLVRPGPLLVS ARR PEIjNQP 
ARLTI*GRWERAPLiASOGWIC < ;RI?APR'nwP<5TRPaAOPZk.DaTrD irr a 

SKGSFADI^AWKPRVLHALQE \AAPEWQ\ PTTVQSSTI PSLLR 
GREIWCAAETGSGKTLSYLLPLLQRLLG\HPSI>DSLPIPAPRGL 
VLVPSRSIAQQVRAVAQPLGRSLGLLVRDTjKGGHGMRRIRLQLS 
RQPS ADVDVATPGALWKALKSRLI SX.EQLS FLVUDEADTLLD2S 
FLELVDYILEKSH I AEGPADLEDP FNPKAQLVLVGATFPEGVGQ 
LLNKVASPEIAVTTITSSlfT.Hr'TNrDWWA'rPT tst vranm;ap7 iru 
I L KHR DRAERTGPS GTVLVFCNSS S T VNWDG Y I LDDHK I QHLRL 
QGQMPALMRVG I FQS FQKS S RD I LLCTDI ASRGLDS TGVELWN 
YDFP PTLQDY IHRAGRVGRVGSEVPGTVISFVTH P WDVSLVQKI 
ELAARRRRSLPGIiASSVKEPLPQAT 


6072 


1 


742 


KMERTEMMPT INS QLEFKS KPFPLVSSSRWLVKRGELTAY VEDT 
VLFS RRTS KQQ VYF FLFND VL 1 1 T KKKS EES YNVND YS LRDQLL 
VESCDNEELNSSPGKNSSTMLYSRQSSASHLFTLTVLSNHANEK 
VEMLLGAETQSERARWI TALGHSSGKPPADRTSLTQVE I VRS FT 
AKQPDELSLQVADWIiI\YQRVSDGWYEGER\LRDGERGWFPME 
CAKE ITCOAT I DKNVERMfTR T .r.T?T .pttov 


6073 


620 


S60 


PCRRGLARPLSRRPG/SILVHCAVGVSRSATLVLAYLMIiYHHLT " 
LVFA I KKVKDHRG 1 1 PNKG FLRQLIALDRRLRQGLEA 


6074 


166 


1110 


pgarcwatelqcpdsmpchnqqvnsastpspeOlrpgdlildha 

GGNRASRAKVIIiLTG YAHS SLPAELDSGACGGSS LNS EGNSGS G 
DSSSYDAPAGNSFLEDCELSRQIGAQLKLLPMNDQIREIiQTIIR 
DKTAS RG DFMFS ADR 1*1 R T.WP WCZT .wr»T . u vinfmin ftt mv« v v vp 

GVKFEKGNCGVSIMRSGEAMEQGLRDCCRSIRIGKILIQSDEET 
QRAKVYYAKFPPD I YRRKVT»LM YPILQTG \ NTVI E AVKVL I EHG 
VQPSVI I LLSLFSTPHGAKS I IQEFPKI TI LTTEVHPVAPTHFG 
QKYFGTD 


6075 


320 


1091 


PPTGQPQEVEHH\YGYVPILGNKTLPSRCHQCVIVSSSSHLLGT " 
KLGPEI BRAECTI RMNDAPTTGYSADVGNKTTYRWAHSS VFRV 
LRRPQEFVNRTPETVFIFWGPPSKMQKPQGSLVRVIQRAGLVFP 
NMEAYAVSPGRMRQFDDLFRGETGKDREKSHSWLSTGWFTMVIA 
VELCDHVHVYGMVPPNYCSQRPRLQRMPYHYYEPKGPDECVTYI 
QNEHSRKGNHHRFITEKRVFSS WAQLYG I TFSHPSWT 


6076 


1721 


107 


H PS PTEAPR VQHLTMDCTWR I LFLVAAATGTHAQ V<^1* VQSGAE V " 

KKPGAS VKVS CKVSG YTLTE LS MHWVRQAPGKGLE WMGAFDPE D 

GETI YAQKFQGR VTMTE DTS TDTAYMELS S LR3 EDTAVY YCATD 

HGDYAFDIWGQGTMVTVSSAPTKAPDVFPIISGCRHPKDNSPW 

IiACLITGYHPTSV\TVTWYiMGTQSQA\QRTFPEIQRRDSYYMTS 

S QLSTPLQQWRQGEYKCWQHTASKSKKEI FRWPESP KAQASS V 

PTAQPOAEGSliAKATTAPATTRNTGRGGEEKKKEKEKEEQEERE 

TKTPECPSHTQPLGVYLLTPAVQDLWLRDKATFTCFWGSDLKD 

AHLT WEVAGKV PTGGVE EGLLE RHS NGSQ S QHS RLTL PRS L WNA 

GTSVTCTIiNHPSLPPQRLMAU^PAAO^PVKLSLNLLASSDPPE 

A\ASWIjLCEVSGFSPPNILLMWLEDHGEVNTSGFAPARPLPKP\ 

RSTrFWAXWSVURVPAPPSPQPATYTCWSHEDSRTLLNASRSL 

BVSYVTDHGPMK 


6077 


3687 


1268 


LLPDMNLQPlFWIGIiISSVCCVFAQTDENRCLKANAKSCGECIQ' 
AGPNCX5WCTNSTFLQEGMPTSARCDDLEALKKKGCPPDDIENPR 
GSKD I KKNKNVTNRS KGtAEKLKPEDITQI QPQQLVLRLRSGEP 
QTFTLKFKRAED YP I DL Y YLM\ DLS YSM KDDLENVKSLGTDLMN 
EMRR I TSDFR IGFGS F VE KTVM P Y I STT PAKLRNPCTS EQNCTS 
PFS YKNVLS LTNKGE VFNE LVGKQR I SGKIiD S PEG GFDAI MQ VA 
VCGSIjI GWRlJVTRLLVFSrDAGFHFAGDGKLGGIVIjPWIX^QO^ 


6078






ENNMYTMSHYYDYPS IAHLVQKLSENNIQTI FAVTEEFQPVYKE'*" 
LKNLIPKSAVGTLSANSSNVI QLI IDAYNSLSSEVI LENGKLS E 
GVTISYQSY\ CKNGVNGTGENGR trr^NT c T^rvcimiwr c T mn»., 

CPKKDSDSFKIRPLGFTEEVEVILQYICECECQSEGIPESPKCH 
EGNGTFECGACRCNEGRVGRHCECSTDEVNSEDIGCFTARKENQ 
FQKSASNHGRVPSAGQCVCRKRDNTNEIYSGKFCECDNFNCDRS 
NGli I CGGNGVCKCRVCECNPNYTGSACDCSLDTSTCEASNGQI C 
NGRGICECGVCKCTDPKFQGQTCEMCQTCLGVCAEHKECVQCRA 
FNKGEKKDTCTQECSYFNITKVESRDKLPQPVQPDPVSHCKEKD 
VDDCWFYFTYS VNGNNEVMVHWENPECPTGPDI IPIVAGWAG 
I VL I GLALI>L I WKLLM I IHDRRE FAKFEKEKMNAKWDTGENPI Y 
KSAVTTWN P K YEGK 


6079


1426 


180 


^ ' Ui *JUBOwAjxv-f iLLibr ^i^KVJbi^USHNFCKKCLiEGIIjE 
GSVRNSLWRPVPFKCPTCRKKTFSYWELIPliQVNYSLKGIVEKY 
NKI KISPKMP VCKGH \ LGQPLNI F\ CL\TDMQLDL/CG I C\ ATR 

GEHTKHVFCSIEDAYAQERDAFESLFQSFETWRRGDALSRLDTL 
ETSKRKSLQLLTKDSDKVKEFFEKLQHTLDQKKNEILSDFETMK 
LAVMQAYDPEiNKLKTIIiQEQRMAFNIAEAFKDVSEPIVFLQQM 
QEFREKIKVIKETPLPPSNLPASPLMKNFDTSQWEDIKLVDVDK 
LSI>PQDTX3TFISKlPWSFYKLFLl,rLLLGI.VlVFGPTMFLEWSL 
FDDLATWKGCLSNFSS YIiTKTADFI EQSVFYWEQVTDGFF I FNE 
RFKNFTLWLNNVAEFVCKYKLL 


6080


1586 


141 


ATARDLGCARRlDRVVME3TPSRGLNR\mLQCRNLQEFLGGLSP 
GVLDRLYGHPATCIiAVFRELPSLAKNWVMRMLFLEQPLPQAAVA 
v * v *^ co * v ^v*»*^«wlj&«ijKIWHTQIjLPGGLO^LIIiNPIFRQ 
NLRIALLGGGKAWSDDTSQLGPDKHARDVPSLDKYAEERWEWL 
HFMVGS PSAAVSQDIAQLLSQAGJUMKSTEPGEPPCITS AGFQFL 
LLDTPAQLWYFMLQYLQTAQSRGMDLVEILSFLFQLSFSTLGKD 
YS VEGMS DS L LN FLQHLRE FGL VPQRKRKS RR YYPT/RALAI NL 
SSGVSGAGGTVHQPGFIVWETNYRLYAYTBSELQIALIALFSE 
MLYPFP\NMW\ARVTR\ESVQQAIASGITAQQIIHFLRTRAHP 
VKLKQTPVLPPTITDQIRI^WELERDRLRFTEGVLYNQFiSQVDF 

ELL\LAHAPKLGVLVFB/NTPAKRLMWTPAGHSDVKRFWKRQK 
HSS 


6081 


1 


1199 


v «*-"u»m«v«/ui v SKyKAATQGLGSNQNALKYLGQDFK 
TLRQQCLDSGVL FKDP E FPACPS ALG YXDLG PGS PQTQG 1 1 WKR 
PTELCPSPQFIVGGATRTDICCXSGLGDCWLLAAIASLTLNEEIiL 
YRWPRDQDFQENYAGIFHFQPLCPPS?\FWQYGEV7VEWIDDR 
LPTKNGQLLFLHSEQGNEFWSALLEKAYAKLNGCYEAIiAGGSTV 
EGFEDFTGGI SEF YDL KKP PANIj YQI IR KALCAGS LUGGS ID VY 

SAAEAEAITSQKIjVKSHAYSVIXSVEEVNFCJGHPEKLIRLRNPWG 
SVEKSGAWSDDAPEWNHIDPRRKEEIiDKKVEDnPPWMQT cnmro 

QF|R lei c^spdslsseevhkwnlvlfnghwtrgstaggcqny 


6082


3 


865 


EMLPHiliPLPL^WA/GAlAQDARFRLEMPESVTVQEGLCIFVHC 
SVFYLEYGWKDSTPAYGHWFREGVSVDQETPVATNNSTQKVQKE 
TQGRFHLLGDPSRNNCSLSIRDARRRDWGSYFFWVARGRTKFSY 
KYSPLSVYVTALTHRPDILIPEFLKSGHPSNLTCSVPWVCEQGT 
PPIFSWMSAAPTSLGPRTLHSSVLTIIPRPQDHGTNLICQVTFP 
3AGVTTERTIQLS VSMKSGTVEE WVLAVGWAVKI LLLCLCL I 
I LSFHKKKAVRAVE VEENVYAVMG 


6083 


283 

1865 " ■"' 


3 
1 
I 
i 
1 
f 

309 K 


SARSPGPTQTRTAPGLAAPGLAQPAALRLLLSRPPSAAMDGDGD 
PESVGQPEEASPEEQPEEASAEEERPEDQQEEEAAAAA\Y\LDE 
j PEPIiLA/LRVIiAALPRH E \ IiVQACR \ LVCLRWKE LVDGAPLWL 

jKCQQEGLVPEGGVEEERDHWQQFYFI»SKRRRNLI,RNPCGEEDL 

:gwcdvehggdgwrveelpgdsgvefthdesvkkyfassfewcr 

CAQVIDLQAEGYWEELLDTTQPAIWKDWYSGRSDAGCLYELTV 

^lsehenvlaefssgqvavpqdsdgggwmeishtftdygpgvr 

'VRFEHGGQDSVYWKGWFGARVTNSSVWVEP 

>Q W CAERRGLGMS1ADE LLADIjEEAAE EEEGGS YGB E EEEPAI E 


6084 






UVQEETQLDLSGDSVKTiAKLWDSKMFAElMMKIiiEYISKQAKA^ 
SEVMGPVEAAPEYRVIVDTuNNLTVEIENELNIIHKFIRDKYSKR 
F PE LES Ii VPNALD Y I RTVKELGNS LDKCKNNENLQQ I DTNATI M 

WSVTASTTQGQQLSEBBLERLEBACDMALELNASKHRIYEYVE 
SRMSFIAPNIjS 1 1 IGASTAAKIMGVAGGLTNLS KMPACNIMIiLG 
AQRKTLSGFSSTSVLPHTGYIYHSDXVQSLPPIPPPFSVAP\DL 
RRKAARLVAAKCTIiAARVDS FHESTEGKVGYELKDE IERKFDKW 
QEPPPVKQVKPLPAPLDGQRKKRGGRRYRKMKERLGLTEIR\KQ 
ANRMSFGEIEEDAYQEDLGFSLGHLGKSGSGRVRQTQVNEATKA 
RISKTLQRTLQKQSWYGGKSTIRDRSSGTASSVAFTPLQGLEI 
VNPQAAEKKVAEANQKYFSSMAEFLKVKGEKSGLMST 


6085 


1865


309 


1 KQWCAKKRGLGMSIADELJLiADLEEAAEEEEGGSYGKEEEEPAIE 
D VQE E TQLDLS GDS VKTI AKLWDS KM FAE I MM KI EE YI S KQAKA 
SEVMGPVEAAPEYRVIVDANNLTVEIENELNI IHKFIRDKYSJCR 
FPELESLVPNAIJDYIRTVKELGNSLDKCKNNENLQQILTNAT1M 
WSVTASTTQGQQLSEEELERLEEACDMALELNASKHRIYEYVE 
SRMSFIAPNLSIIIGASTAAKIMGVAGGLTNLSKMPACNIMLLG 
AQRKTLSGFSSTSVLPHTGYIYHSDIVQSI*PPIPPPFSVAP\Db 
RRKAARLVAAKCTIiAARVDSFHESTEGKVGYELKDEIERKFDKW 
QE P P PVKQVKPLPAPLDGQRKKRGGRRYRKMKERLGUTE IR \ KQ 
ANRMS FGE I EEDAYQEDLGFSLGHLGKSGSGRVRQTQVNEATKA 
RISKTLQRTLQKQSWYGGICSTIRDRSSGTASSVAFTPLQGIjEI 

|_VNPQAAE KKVAEANQKY FS SMABFLKVKGEKSG LMST 


6086 


2 


1456 


, SGPRSFQGNKAVGRJSLGGKRKPEVTLLPGVSSERVRRWRRARV 
GVARVKPGNPWKPSPATQVPR/VPAQVYLPGRGPPLREGEELVM 

deeayvlykraqtgapclsfdivrdhlgdnrtelpltlylcagt 

OAESAQSNRLMMLRMHNLHGTKPPPSEGSDEEEEEEDEEDEEER 
KPQLEIAMVPHYGGINRVRVSWLGEEPVAGVWSEKGQVEVFALR 
RLLQ WSEPQAIiAAFIiRDEQAQMKP I FS FAGHMGEG FALDWS PR 
VTGRLLTGDCQKN I HLWT PTDGGS WHVDQRPFVGHTRS VEDLQW 
SPTENTVFASCSADASIRIWDIRAAPSKACMLTTATA^DGDVW 
ISWSRREPFJLLSGGDDGALKIWDLRQFKSGSPVATFKQHVAPVT 
S VE WH PQDSGVF AASG ADKQ I TQWDLG / 1 VERD P EAGDVEAD ^>G 

LADLPQQIiLFVHQGETEI»KEL»HWHPQCPGLLVSTALSGFTIFRT 
ISV 


6087 


2413 


1357 


CjAATQHGGAMNLIi PCNPHGNGLLYAG FNQDHG CFACGME1CGFR V 
YOTDPLKEKEKQEFLEGGVGHVEMLFRCWYIALVGGGKKPKYPP 
NKVMI WDDLKKKTVI EIEFSTE VKAVKLRR\DKI WVLiDSMI KV 
FTFTHNP\HQLHVFE\TCYNPKGLCVLCPNSNNSLLAFPGTHTG 
HVQLVDLASTE KP PVDI PAH EG VLS C I ALNLQGTR I ATASEKGT 
L IRI FDTSSGHIi I QELRRGSQAANT YC I NFNQDAS 1»I CVSSDHG 
TVHIFAAEDPKRNKQSSLASASFLPKYFSSKWSFSKFQVPSGSP 
CICAFGTEPNAVIAICADGSYYKFLFNPKGECIRDVYAQFLEMT 


6088 


476 


1877 r 

( 
] 
I 

c 

1 a 


^SQRTGLPITIFSRSFPLLTGSDLCENMPCTCTWRNWRQWIRP"- 
LVAVI YLVS I WAVPLCVWELQKIiEVG IHTKAWFlAGI FZjLIiTI 
PIS L WVI LQHLVHYTQPELQKP I IRI LW MVP I YSLDS W I ALKYP 

3IAIYVDTCRECYEAYVIYNFMGFLTNYLTNRYPKLVLILEAKD 
2Q KHFP PLCCCP PWAMGE VLIi FR CKLG VLQ YTWR P FTT I VAL I 

:ellgiydegnfsfsnawtylviinnmsqlfamyclllfykvlk 

3ELSPIQPVGKFLCVXLWFVSFWQAWIALLVKVGVISEKHTW 

2wqtveavatglqdfiiciemfiaaia\hhytfsykpyvqeaee 

3SCFDSFLAMWDVSDIRDDISEQVRHVGRTVRGHPRKKLFPEDO 
)QNEHTSLLSSSSQDAISIAS5MPPSPMGHYO^FGHTVTPQTTP 
"TAKISDEILSDTIGEKKEPSDKSVDS 




1684 


689 "c 

K 
S 

Q 

T 


rASGL»VRI>LQQGHRCL LAP VAP KL VP P VRG VKKG FRAA FR FQKE 
.ERQRLLRCPPPPVRRSEKPNWDYHAEIQAFGHRLQENFSLDLL 
TAFVNSCYIKSEEAKRQQLGIEKEAVIJLNLK5NQELSEQGTSF 
QTCLTQFLEDEYPDMPTEGIKNLVDFXTGEEWCHVARNLAVE 
LTLSEEFPVPPAVLQQTFFAVIGALLQSSGPERTALFIRDFLI 
QMTGKELFEMWKIINPMGIiLVEEIjKKRNVSAPESRLTRQSG\A 








PTALPLYFVGLYCDKKLIAEGPGBTVLVAEEBAARVALRKLYGF "1 
TENRR P WN YS KPKET1»RAEKS 1 TAS | 


6089
6090


3 


3054 


TRLGIPGSTISSRPRLCALAAEGHFIjGHSWTCJSKAGAHTGAPAW 
PSRRLRDLPAGGMWRLRRAAVACEVCQSLVKHSSGIKGSLPLQK 
LHLVSRS I YHSHHPTLKLQRPQLRTSFQQFSSLTNLPLRKLKFS 
PIKYGYQPRRNFWPARIiATRLLKLRYLILGSAVGGGYTAKKTFD 
QWKDMIPDLSEYKWIVPDIVWEIDEYIDFEKIRKALPSSEDLVK 
LAPDFDKIVESLSLLKDFFTSGSPEETAFRATDRGSESDXHFRK 
VS DKEK I DQLQEELLHTQLK YQR I LERLEKENKELR KL VLQKDD 

KGIPFIESLRKSLIDMYSEVLDVIjSDYDASYNTQDHLPRWWG 
DQSAGKTSVLEMIAQARIFPRGSGEMMTRSPVKVTLSEGPHHVA 
LFKDSSREFDIiTKEEDLAALRHEIELRMRKNVKEGCTVSPETIS 
LNVKGPGLQRMVLVDLPGVINTVTSGMAPDTKETIFSISKAYMQ 
DPNAIILCIQDGSVDAERSIVTDLVSQMDPHGRRTIFVLTKVDL 
AEKNVASPSRIQQI IEGKLFPMKALGYFAWTGKGNSSESIEAI 
RE YE EEFFQNS K2jL KTS MLKAFOVTTRNLS I, A V<?nrT?w vivrv/o w c 
VEQQADSFKATRFNLETEWKNKYPRLRELDRNELFEKAKWEILD 
EVISLSQVTP KHWEEI LQQSLWERVSTHVIENI YbPAAO/rMNSG 
TFNTT VD I KLKQWTDKQL PNKAVE VAWETLQEE FSR FMTE P KG K 
EHDDI FDKLKEAVKEES I KRHKWNDFAEDSLR VIQHNALEDRS I 
SDKQQWDAAIYFMEEALQARLKDTENAIENMVGPD\WKKRWIjYW 
KNRTQEQCVHNETKNELEKMLKCNEEHPAYLASDEITTVRKNLE 
SRGVEVDPSLIKDTWHQVYRRHFLKTALNHCNLCRRGFYYYORH 
FVDSEbECNDWLFWRIQRMlAITANTIiRQQLTNTEVRRLEKNV 

KEVLEDFAEDGEKiaKLLTGKRVQLAEDIiKKVREIQEKLDAFIE 
ALHQEK 


6091 


194 


1560 


PVFVPAPGAVLEQAS /ASPPLATQTWPLQHCk 1 ^ prt mrn&d T f 

FELQIjFFCQLIALFVHYINlYKTVWWYPPSHPPSHTSLNFHIilD 
FNLLMVTTI VLGRRFIGS I VKEASQRGKVSLFRS I LLFLTRFW 
LTATG WSLCRS I* IHLFRT YS FLNLL/FPI,LSVWDVHS VPAAEI*R 
P\RKTSLFNHMASMGPREAVSGLAKSRDYLLTLR\RRGSSTQDS 
CMARTPCP/PHACCLS PS L IRS EVEFUCMDFNWRMKE VLVS SML 
SAYYVAFVPVWFVKNTOYYDKRWSCELFLLVSISTSVILMQHIi 
PASYCDLLHKAAAHIiGCWQKVDPALCSNVLQHPWTEECMWPQGV 
LVKKS KNVYKAVGH YNVA I PS D VSH FR FHFFFS KPLR ILNI I»1»L 
LEGAVI VYQLYS LMSSEKWHQTI SLALI LFSNY YAFFKLLRDRL 
VLGKAYSYSASPQRDLDHRFS j 


6092 


3279 


412 

J 
1 

c 

5 


s 5KTKEMEEKE I liRRQ IRLLQGL I DD YKTIiHqNAPAPGTPAASG 
WQPPTYHSGRAFSARYPRPSRRGYSSHHGPSMRKKYSLVNRPFG 
PSDFPADHAVR PLHGARGGQP P VPQQHVI#EROVOLSQGQNWI K 
VKPPSKSGSASASGAQRGSLEEFEDTPWSDQRPREGEGEPPRGQ 
LQPSRPTRARGTCSVEDPLIiVCQKEPGKPRMVKSVGSVGDSPRE 
PRRTVS ESVIAVKAS F PSSALPPRTGVALGRKLGSHS VASCAPQ 
LLGDRRVDAGHTDQPVPSGSVGGPARPASGPRQAREASLWTCR 
TNKFR KNN YKWVAAS SKS PR VARRALS PR VAAENVCKAS AGMAN 
KVEKPQLIADPEPKPRKPATSSKPGSAPSKYKWKASSPSASSSS 
SFRWQSEAGSKDHASQI,SPVLSRSPSGD\RPAVGHSGLKPLSGE 
TPLSAYKVKSRTKIIRRRGSTSLPGDKKSGTSPAATAKSHLSLR 
RRQALRGKS S PVLKKTPNKGLVQ VTTHRLCRLPPS RAHL PTKEA 
SSLHAVRTAPTSK VI JCTKYR I VKKTPAS PZjSAPPFPLSLPSWRA 
RRLSLSRS LVLNRLRPVASGGGKAQPGS PWWRSKGYRCIGGVLY 
ECVSANKLS KTSGQPSDAGSRPLLRTGRLDPAGSCSRSLASRAVQ 
RSLAIIRQARQRREKRKEYCMYYNRFGRCNRGERCPYIHD^EKV 
WCTRFVRGTCKKTDGrCPFSHHVSKEKMPVCSYFLKGI CSNSN 
rPYSHVYVSRKAEVCSDFLKGYCPLGAKCKKKHTLLCPDFARRG 
*CPRGAQCQLLHRTGKRHSRRAATS PAPGPSDATARSRVSASHG 
PRXPSASQRPTRQl'PSSAALTAAAVAAPPHCPGGSASPSSSKAS 
3SSSSSSSPPASLDHEAPSLQEAAIAAACSNRLCKLPSFXSLQS 
jPSPGAQPRVRAPRAPLTKDSGKPLHIKPRIi 1 




143 


3190 " J 
£ 


UCAPPTGESSEPEAKVIjHTKRLYRAVVEAVHRI^LIIiCNKTA^Q 1 
^FKPENISLJ^KhRELCVKLMFLHPVDYGRKAEEl^YtRKVYYB 






WO 01/53312 



PCT/US00/34263











VlQlxX KTNKKH I HSRS TLECAYRTHLVAG IGF YQHLLL Y1QS H Y 
QLELQCCIDWTHVTDPLIGCKKPVSASGKEMDWAQMACHRCLVY 
LGDLSRYQNELAGVDTELIAERFYYQALSVAPQIGMPFNQLGTL 
AGS KY YNVE AMY C YLRC I QSEVS FEGAYGNLKRLYDKAAKMYHQ 
LKKCETRKLSPGKKRCKDIKRLLVNFMYLQSLLQPKSSSVDSEL 
TSLCQSVLEDFNLCLFYLPSSPNLSLASEDEEEYESGYAFLPDL 
Ij I FQMVI I CLMCVHS LERAGSKQ YSAAI AFTLAL FSHL VNHVN I 
RLQAELEEGENPVPAFQSDGTDEPESKEPVEKEEEPDPEPPPVT 
PQVGEGRKSRKFSRLSCLRRRRHPPKVGDDSDLSEGFESDSSHD 
SARASEGSDSGSDKSLEGGGTAFDAETDSEMNSQESRSDLEDME 
EEEGTRSPTLEPPRGRSEAPDSLNGPIiGPSEASIASNLQAMSTQ 
MFQTKRCFRLAPTFSNLLLQPTTNPHTSASHRPCVNGDVDKPSE 
PASEEGSESEGSESSGRSCRNERSIQEKLQVLMAEGLLPAVKVF 
LDWLRTNPDLIIVCAQSSQSLWNRliSVLLNLLPAAGELQESGLA 
LCPEVQDLLEGCELPDLPSSLLLPEDMALRNLPPLRAAHRRFNF 
DTDRPLLS T LEE SWR I CC IRS FGHF I ARLQG S ILQFNPEVG I F 
VS I AQS EQESLLQQAQAQFRMAQEEARRNRLMRDMAQLRLQLEV 
S QLEG S LQQPKAQS AMS P YL VP DTQALCHHL P VI RQLAT SGRF I 
VI I PRTVI DGLDLLKKBHPGARDGI RYLEAE FKKGNR YIRCQKE 
VGKSFERHKLKRQDADAWTLYKILDSCKQLT\LAQGAGEEDPSG 
MVTIITGLPLDNPSLLSGPMQAALQAAAHASVDIKNVLDFYKQW 
fCEIG 


6093 


76 


1002 


ACGRRAMLALRVART / S RWGAL \RGAVWAPGTRPSKRRA C WALL 
P P VP CCLG CLAERWRLRP AALG LRL PG I GQRNHCS GAG KAAP R \ 
PAAGAGAAAEAPGGQWG PAS T PSL Y ENP WT I PNMLSMTR I GLAP 
VLGYLI IEEDFNIALGVFALAGLTDLLDGFIARNWANQRSALGS 
ALD PLADK I L I S I L YVS LT YADLI P VPLT YM 1 1 SRD VML I AAVF 
YVRYRTL PTPRTLAKYFNPCYATARLKPTFI S KVNTAVQLIL VA 
AS LAA P VFNYADS I YLQI LW CFTAFTTAAS AY S Y YHYGRKTVQV 
IKD 


6094 


23 


i m ri 

lUiv 


WRLMAP FNMRCKTCGE YI YKGKKFNARKETVQNE VYLGLP I FR 
FYIKCTRCLAEITFKTDPENTDYTOEHGATRNFQAEKLLEEEEK 
R VQ KERED E ELNNPMKVLENRTKDS KLEMEVLENLQE L KDLNQR 
QAHVDFEAMLRQHRLSEEERRRQQQEEDEQETAALLEEARKRRL 
LEDSDSEDEAAPSPLQPALRPNPTAILDEAPKPKRKVEVWEQSV 
GS LGSR P PLSRL WVKKAKADPDCSNGQ P Q A/ APHPRS PAEQEG 
GQPYTPDAWRVLPEPTGCIPGQ 


6095 


1 


1599 


TRGRAAERS RGRGHGFLGGG FA\ S WD YF P S EDF YRCG YCKNES 
GSRSNGMWAHSMTVQDYQDLIDRGWRRSGKYVYKPVMNQTCCPQ 
YTIRCRPLQFQPSKSHKKVLKKMLKFLAKGEVPKGSCE\DEPMD 
STMDDAVAGDFALINKLDIQCDLKTLSDDIKESLESEGKNSKKE 
EPQELLQSQDFVGEKLGSGEPSHS 





6096 






VKVHTVPKPGKGADLSKPPCRKAKBIRKBRKRLKLMQQNPAGEL 
EGFQAQGHPPSLFPPKAKSNQPKSLEDLIFESLPENASHKLEVR 
WRSSPPSSQFKATLLESYQVYKRyQMVIHKNPPDTPTESQFTR 
FLCSSPLEAETPPNGPDCGYGSFHQQYWLDGKIIAVGVIDILPN 
CVSSVYLYYDPDYSFLSLGVYSALREIAFTRQLHEKTSQLSYYY 
MGFYIHSCPKMKYKGQYRPSDLLCPETYVWVPrEQCLPSLENSK 
YCRFNQDPEAVDEDRSTEPDRLQVFHKRAIMPYGVYKKQQKDPS 
j BEAAVLQYASLVGQKCSERMLLFRN 


6097 


2277 


575 


QR VRAAL.US SAM EDSEALiG FEHMGLDPRLLQ AVTDLG WS R PTL I 
QBKAIPIiALEGKDLLARARTGSGKTAAYAlPMLQLLLHRKA'TCP 
WEQAVRGLVLVPTKELARQAQSMIQQLATYCARDVRVANVSAA 
EDSVSQRAVLMEKPDWVGTPSRILSHLQQDSLKLRDSLELLW 
EEADLLFSFGFEEELKSLLCHLPRIYQAFLMSATFNEDVQAUCE 
LI LHNPVTLKLQESQLPGPDQLQQFQ WCETEEDKFLLL YALLK 
LS L I RG KS L L F VNTLERS Y R LRLFLEQ FS I PTCVLNGELPLRS R 
CHIISQFNQGFYDCVIATDAEVLGAPVKGKRRGRGPKGDKASDP 
EAGVARGIDFHHVSAVLNFDLPPTPEAYIHRAGRTARANNPGIV 
LTF VL PTEQ FHLGKI EELLS GENRG P I LLP YQ FRMEE I EG FR YR 
CRDAMRS VTXQAIREARLKE I KEBLLHSEKLKT YFEDNPR \ DLQ 
LLRHDLPLH PAWKPHLGH VP DYL VP PALRGLVRPHKK\GRS CL 
PLVGRPREQSPRTHCAASSTKERNSDPQPSPPEWGPLWS 






192 


APGTMSGGKKKSSFQITSVTTDYEGPGSPGASDPPTPQPPTGPP 
PRLPNGEPSPDPGGKGTPRNGSPPPGAPSSRFRWKLPHGLGEP 
YRRGRWTCVDVYERDLEPHSFGGLLEGIRGASGGAGGRSLDSRL 
ELASLGU3APTPPSGLSQGPTSWLRPPPTSPGPQARSFTGGLGQ 
LWPSKAKAEKPPLSASSPQQRPPEPETGESAGTSRAATPLPSL 
RVEAEAGGSGARTPPLSRRKAVDMRIiRMELGAPEEMGQVPPLDS 
RPSS PAIiYFTHDAS L VHKS PD P FGAVAAQKFS LAHS MLA I SGHL 
DS DDDS GSGS LVG I DNKIEQAMDliVKSHLMFAVREE VEVJbKEQ I 
RELAERNAALEQENGLIjRAIiA\SPEQLGSAGPPRGVPR\LGPPA 
PNGPFVLSLPSLTIVPLGLPGLASAAWPPLPMPALIVPVFPGVG 

VQALSNGPWSPGPLPHLLIIPSUX3GGEGFRTGRQQGAPFGEET 
QPPPSLPGTPQQ 


6098


168 


1074 

1074 I 


N YCLRHRS P LEKD S S PGS S STS LLI K KQRETSDT P I MRALKELD 
EGKIFKNWGTQTEKEDTSN1NPRQTETSVNASRSPEKCAQQRQK 
RLNSASQRSSSLPPSNRKSSTPTKREIMLTPVTVAYSPKRSPKE 
NLS PG FS HLLS KNES 5 PIRFD I LLDDLDTV P VS TLQRTNPRKQL 

\QFIiPLDDSEEK\TYSEKAT\DNIVNHSSCPEPVPNGVKKVSVR 
TAWE KNKS VS YEQ CKP VS VTPQGNDFE YTAKI RTLAETER FF\ D 
ELTKEKDQIEAALSRMPSPGGRITLQTRLNQEAFGRSFGKD 


6099
6100 


168 




WXULRHRSPLEKDSSPGSSSTSLLIKKQRETSDTPIMRALKELD 
EGKIFKNWGTQTEKEDTSNINPRQTETSVNASRSPEKCAQQRQK 
RLKSASQRS SSLPPSNRKSSTPTKREI MLTPVTVAYS PKRS PKE 
NLS PGFS HLLS KNES S PIRFD ILLDDLDTVPVSTLQRTNPRKQL 
\QFLPLDDSEEK\TYSEKAT\DNIVNHSSCPEPVPNGVKKVSVR 
TAWEKNKSVSYEQCKPVSVTPQGNDFEYTAKIRTLAETERFF\D 
ELTKEKDQIEAALSRMPSPGGRITLQTRLNQEAFGRSFGKD 


6101 


2 


713 , 

< 
1 
} 
1 

( i 


FVJS VSO YRSRADPEPRGRDTMT*AYLFKYI I IGDTGVGKSCLLL 
2FTDKRFQPVHDLTIGVEFGARMVNIDGKQIKLQIWDTAGQESF 
*S I TRS Y YRGAAGALL VYD ITRRETFNHLTS WL EDARQHS S SNM 
/I ML I GNKSDLESRRD VKREEGEAFARE \ HGL I FMETS AKTACN 
/EEAFINTAKE I YRKI QQGLFDVHNE ANG I KIG PQQS 1 STS VGP 
3ASQRNSRDIGSNSGCC 




1 


1399 I 

C 
1 * 


•'RGRAWPLRE VSHWLGCRR VCS WSAS WGRJLPALSARtiS PLLAFR 
JKM VFPLSCAVQQ YAWGKMGSNSE VARLLAS S D PLAQ I AEDKP Y 
LELWMGTHPRGDAKILDNRISQKTLSQWIABNQDSLGSKVKDTF 








NGNLPFLFKVbSVETPLSlQAHPNKEIAEKLHLQAPQHYPDANH 
KP EMA I ALTP FQGLCG FRP VEE I VTFLKKVP E FQFL I GDE AATH 
liXQTMSHDS Q^VASSI^S CFSHLMKSEKKVWEQLNLLVKR ISC; 
QAAAGNNMEDIFGEHjLQLHQQYPGDIGCFAIYFIjNLLTLKPGE 
AMFLEANVPHAYLKGDCVECWACSDNTVRAGIjTPKFIDVPTLCE 
MLSYTPSSSKDRLFLPTRSQEDPYLSIYDPPVPDFTIMKA\EVP 
G\S VTEYKDliAJjDSAS I LLMVQGT VI AS T PTTQT P I PLQRGGVL 
F IGANESVS LKLTEPKDLLI FRACCLiL 


6102 


70 


2415 


QTPQATItAANG AEDS RGG EML PAGE IGAS PAAPCCSESGDERKN 
LEEKSDINVTVLIGSKQVSEGTDNGDLPSYVSAFIEKEVGNDLK 
SLKKLDKLIEQRTVSKMQIiEEQVLTISSEIPKRlRSALKNAEES 

KQFLNQFLEQETHLFSA inshlltaqpwmddlgtmisqiee ier 
HLAYLKWISQIEELSDNIQQYLMTNNVPBAASTLVSMAELDIKL 
QES S CTHLLG FMRATVKFWH KI IiKD KLTS DFEE I LAQLHW P F I A 
PPQSQTVGLSRPASAPEIYSYLETLFCQLLKLQTSHELLTEPKX 
HSQKNTLFLPPLLSS/WPIQVMLTPLQKRFRYHFRGNRQTNVLS 
KPEWYLAQVLMWIGNHTEFLDEKIQPILDKVGSLVNARLEFSRG 
LMMLVIiEKLATDIPCLLYDDNLFCHLVDEVIjIiFERELHSVHGYP 
GTFAS CMH 1 LS EETTCFQR W LTVER KFALQKMDS MLSS E AAWVS Q 
YKDITDVDEMKVPDCAETFMTLLLVITDRYKNLPTASRKLQFLE 
LQKDLVDDFRIRLTQVMKEETRASLGFRYCAILNAVNYISTVLA 
DWADNVFFLQLQQAALEVFAENNTLS KLQLGQIiASMES S VFDDM 
INLIiERLKHDMLTRQVDHVFREVKDAAKLYKKERWLSLPSQSEQ 
AVMSLSSSACPLLLTLRDHLLQLEQQLCFSLEKIFWQMLVEKLD 
VYIYQEIILANHFNEGGAAQLQFDMTRNLFPLFSHYCKRPENYF 
KHI KEACI VLNLN VGSALTAGKDVLP VQLQGS FPAT 


6103


207 


2523 


ESNSTMX^YLEFIQQNEERDGVRFSWNVWPSSRLEATRMVVPW" 
ALFTPLKERPDIiPPIQYEPVLCSRTTCRAVLNPl,CQVDYRAKLW 
ACNFC YQRNQ F P PS YAG I S ELNQ P ABbtiPQFS S I EY WLRGP QM 
PLI FIiYWDTCMEDEDLQAIiKESMQMS LSLLPPTALVGL I TFGR 
MVQVHEIiGCEGISKSYVFRGTKDLSAKQLQEMLGLSKVPVTQAT 
RGPQVQQPPPSNRFLQPVQKIDMNLTDLLGELORDPWPVPQGKR 
PLRSSGVALSIAVGLLECTFPNTGARIMMFIGGPATQGPGMWG 
DEL KT P I RS WHD IDKDNAKYVKKGTKH FEAIiANRAATTGHV IDI 
YACALDQTGIiLEMKCCPNLTGGYMVMGDSFNTSLFKQTFQRVFT 
KDMHGQFKMGFGGTIiEIKTPR\EIKISGAIGPCVSLNSKGPCVS 
ENE IGTGGTCQWKICX3LSPTTTLAI YFEWNQHNAPI PQGG\RG 
A\ IQ FVTQY \ QHS SGQRRIRVTT I ARN \ WADAQTQ I QN I AAS FD 
QEAAA I LMARLAI YRAETEEGPD VLR WLDRQIi IRLCQKFGE YHK 
DDP S S FR FS ET FS I*YPQFM FHL»RRS S FLQ VFNNS PDESS YYRHH 
FMRQDLTQSLIMIQPILYAYSFSGPPEPVLLDSSS I LADR I LLM 
DTFFQILIYHGETIAQWRKSGYQDMPEYENFRHliLQAPVDDAQE 
II4HSRFPMPRYIDTEHGGSQARFLLSKVNPSQTHNNMYAWGQES 
GAP ILTDDVSLQVFMDHLKKIiAVSSAA 


6104 
6105


124 


732 


1 «*■ x jj^i^xx. j.i4i:riMLLKi v iij v j_»v vy.HWiiAARGVLRNYWERLLR 
KLPQSRPGFPS PP WGPALAVQ\AQPCLQSQQMI P VEVKRI /RSL 
LDS I F WMAAPKNRRTIE VNRCRRRNPQKI,I KVKNN I D VC PE CGfi 
LKQKHVLCAYCYEKVCKETAE IRRQ IGKQEGGPFKAPTIET WL 
YTGETPSEQDQGKRIIERDRKRPSWFTQN 




3 


989 


P LiHG ACTS LtVIiQR FCHRRPR P CAPAR PEDMRR PAAVPLLL L LC F 
GSQRAKAATACGRPRMLNRMVGGQDTQEGEWPWQVSIQRNGSHF 
CGGSL I AEQWVLTAAHCFRNTS ETSLYQVLLGARQLVQPGPHAM 
YARVRQVESNPLYQGTASSADVAIiVELEAPVPFTNYILPVCXPD 
PSVIFETGMNCWVTGWGSPSEEDLLPEPRILQKIAVPIIDT\PR 
CNLLYS KDTEFGYQ PKTI KNDMLCAG FEEG KKDACKGDS AG PLV 

^IiVGQSWLQAGVISWGEGCARQNRPGVYIRVTAHHNWIHRIIPK 
LQVQPSEVGRPEVTPPGPGAP


6106 


3 


1302 


GRPPTAPHTGRPPTANRGDPRLDLKRGCARLLTSIESRGRPAAS 
AGLRKDRCALRRWPLRRAPLARATRRRAGSPRRCAPRPRACPQG 
WSRARHQPGGLCLLLLLLCQFMEDRSAQAGNCWLRQAKNGRCQV 
L YKTELS KE ECCS TGR LS TSWT E E D VNDNTLFKWM I FNGGAPNC 
IPCKETCENVDCGPGKKCRMNKKNKPRCVCAPDCSNITWKGPVC 
GLDGKTYRNE CALLKAR C KEQ PELE VQ YQGRCKKTCRD V FC PGS 
STCV\ VDQTNNAYCVTCNR ICPEPASSEQYLCGNDGVTYS \ SAC 
HLR KATCLLGRS IGLAYEGKC I KAKS CED I QCTGGKKCLWDFKV 
GRGRCSLCDELCPDSKSDEPVCASDNATYASECAMKEAACSSGV 
LLEVKHSGSCNSISEDTEEEEEDEDQDYSPPISSILEW 


6107 


623 


168 


SRCSSPRPK PGRGRGK / LS P S EHRKW VE VFKACDEDHKG YLS RE 
DFKTAWMLFGYKPSKIEVDSVMSSINPNTSGILLEGFLNIVRK 
KKE AQR YRNE VRH I FTAFDT YYRGFLTLED FKKAFRQVAP KL PE 
RTVLEVFREV\ DRDS \ DGHVS F 


6108 


3 


1348 


GGSLRFSPPRVPSCSRVFCPVPPGGCGLPSPMSASRPQSPTTPW 
CLPRRYMKHKRDDGPEKQEDEAVDVTPVMTCVFWMCCSMLVLL 1 
YYFYDLLVYWIGIFCLASATGLYSCLAPCVRRLP\SASAGESA I 
LLAPT I PNNSL P Y FHKR PQARMLLLALFC VAVS WWGVFRNEDQ 
WAWVLQDALG IAFCLYMLKT IRLPTFKACTLLLLVIjFLYDI FPV 
FITPFIjTKSGSSIMVEVATGPSDSATREKLPMVLKVPRLNSSPL 
ALCDRPFSLLGFGDILVPGLLVAYCHRFDIQVQSSRVYFVACTI 
AYG VGLL VTFVALAIjMQRGQ PALL YLVPCTLVTS CAVAL WRRE L 

gvfwtgsgfakvlppspwapapadgpqppkdsatplspqppsee 
pats pwpaeqs pksrtseemgagapmrepgs paes egrdqaqps 
pvtqpgasa 


6109 


1 


1381 


crsragaasggailegtklrrqrvdtnkpldplvpsalraaml'y 
ledylemieqlpmdlrdrftemremdlqvqnamdqleqrvseff 
mnakknkpewreeqmasi kkdyykaledadekvqlanqiydlvd 
rhlrkldqelakfkmeleadnagiteilerrsleldtpsqpvnn 
hhahs htp ve krk yn ptshhtttdhi p e kkfks eal lstlts da 

skentlgcrnnnstassnwaynvnssqplgsynigslssgtgag 

GX\TMAAAQAVQATAQMKEGRRTSSLKASYEAFKNNDFQLGKEF 
SMARETVGYSSSSAIJ^TTLTQNASSSAADSRSGRKSKNNNKSSS 
QQSSSSSSSSSLSSGSSSSTWQEISQQTTWPESDSNSQVDWT 
YDPNEPRYCICNQVSYGEMVGCDTQDCPIEWFHYGCVGLTEAPK 
GKWYCPQCT\AAMKRRGSRHK 


6110 


77 


2464 



ACPSAATMSDQDHSMDEMTAWKIEKGVGGNNGGNGNGSGAFSQ 

ARSSSTGS SSSTGGGGQESQPS PLAI.LAATCSR IES PNENSNNS 

QGPSQSGGTGELDLTATQLSQGANGWQIISSSSGATPTSKEQSG 

SSTNGSWGSESSKNRTVSGGQYWAAAPNLQNQQVLTGLPGVMP 

NIQYQVI PQFQTVDGQQLQFAATGAQVQQDGSGQ IQI I PGANQQ 

I ITNRGSGGNI IAAMPNLLQQAVPLQGLANNVLSGQTQYVTNVP 

VALNGNI TLLPVNSVSAATLTPSSQAVTI SSSGSQESGSQPVTS 

GTTISSASLVSSQASSSSFFTNANSYSTTTTTSNMGIMNFTTSG 

SSGTNSQGQTPQRVSGLCjGSDALNIQQNQTSGGSLQAGQQKBGE 

Q\NQQTQAAPKSI,SRPQLVQGG\QALQ\AFQAAPLSGQTFTTQA 

ISQETLQNLQLQAVPNSGPIIIRTPTVGPNGQVSWQTLQLQNLQ 

VQNPOAQTITLAPMQGVSLGQTSSSNTTLTPIASAASIPAGTVT 

VNAAQLSSMPGLQTINLSALGTSGIQVHPIQGLPLAIANAPGDH 

SAQLGLHGAGGDG I HDD TAGG E EGENS PDAQPQAGRRTRREACT 

CP YCKDSEGRGSGDPGKKKOHI CHI QGCGKVYGKTSHLRAHLRW 

•iTGERPFMCTWSYCGKRFTRSDELQRHKRTHTGEKKFACPECPK 

IFMRSDHLS KH I KTHQNKKGG PG VALS VGTLPLDS GAGS EGSGT 

VTPS AL I TTNMVAMEA I CPEG IARLANSGXNVKEGGQFCS P INT 

3ANGF 



6112 



6113 



6114 



6115



6116 



6118 



6119 



77 



1779 






797 



196 



818 



324 
595 



567 



71 
1430 



1433 



1044 



1217 



222 



247 



462 



KVUPRVRGAMAPWGKRIAGVRGVLLDISGVLXOSGAGGGTAIAG 



-*xw W irti-«oi\^jjA\jvKVjVLiiji;x5GVIjYDSGAGGGTAIAG 

SVEAVARLKRSRIiKVRFCTNESQKSRAELVGQLQRLGFDISEQE 
VTAPAPAACQ ILKERGLRP YLL I H DGV\ASE FDQ t DTS / STPNC 
VVI ADAGE S FS YQNMNNA FQ VLMELEKPVLI S LGKGR Y YKETSG 
LMLDVGPYMKALEYACGIKAEVGGKPSPEFFKSALQAIGVEAHQ 
AVMIGDDIVGDVGGAQRCGMRALQVRTGKFRPSDEHHPEVKADG 
YVDNIiAEAVDLL LQHADK 

MSSHKSFKSKRFIiAKKQKP NRPlLQWIWIjKTGNXIRHNWir 

MBflPgMa&fV?tn>T TV r..^-. n'^ ^ - _ 



WBGRS WAACGVNI,QGAWGERSG VRA SEAESPGKRAPVSWWS RQL 
ETMVDHIiANTE INSQRI AAVESCFGASGQFLALPGRVLLGEG VL 
TKECR KKAKPR I FFLFND I LV YGS I VLNKRK YRS QH 1 1 PLE EVT 
LELLPETLQAKNRWM I KTAKKSF WSAASATERQEW ISHI EECV 
RRQI>RATGRPA\ STEHAAPWI PDKATD I CMRCTQTRFSALTRRH 
HCRKCRVWCAEPRPnR pt.t .dot .e n vmmi t^pt ^.^r-.„ 



• — v riwion x ut^k e Vi^TMTPTRTRRAAG^ATG 

PAAWSSTPRGWPGLPSTADPRPAEHL5PSQLHCPGPQEGSSRSC 
PGLRD P I P WKQVQRWG VAL SGLPVP FCWTLCP YG FTAGNAFPF ^ 
KPQNTHRSW 

PTSRPRPSFUSPAMSWSACVSAAPSS SWPASSSWPCGPRRCCTR " 
RRRCS PRCGLAAGSMCSCS PS WRCTPVPACWPS P PP \PAEQVQC 
GHLPPKADRRALRLP VAAP ARG PG PGHPAGPAG PRPARTP PAS ° 

HGPGRPTVPAPPCPLLAATEPTPSRPHQRWTRBDRMLGRGSQVT 
GRPQWFLRGLVLFSL 

UVCGRVCAHPHJLYTHIHWH IC^AC\IHTHAULCV ITASHALAH 
SHLYTCM^LTASHTPSHTHPHTAVHKE HRADVLRGTLTPLR 
. TUVMP PGRWHAA/ 1 SSSGPVFEGARA\ LQTVKKEKEDES YTPVQ 
AARPQTLNRPGQELFRQLFRQLRYHESSGPLETLSRLREt,CRWt7 
LRPDVIiSKAQILELLVLEQFLSILPGELRVWVQLHNPESGBEXL 
W PCWRS CRG TIiMGHPGGTRAIj P \ E PRCALDG YRS \ LRSAQ I WS L 
AS PLR S SS ALGDHLE PP YE I EARDFLAGQSDTPAAQMPALFPRE 

GCPGDQVTPTRSLTAQLQETMTFKDVEVTFSQDEWGWLDSAORN 
1 LYRDVMLENYRNMASLGK U 



VGVPS PAPPCSWBVGPGGGWT PCj ILKEGQGGRRTPI^LLATRT R 
GLLSL FP PAAMHPAA FFL»P VVVAAVL WGAAP TRGL I RATSDHNA 
SMDFADLPAIiFGATLS QEGliQGFIiVEAHPDNACS P I AP 3 PPAPV 
NGSVFIALLRRFDCNFDLKVLNAQKAGYGAAWHNVNSNEIiLNM 
I VWNSEEIQQQIWIPSVFIGERSSEYLRALFVYEKGARVLLVPDN 
TFPLGYYLIPFTGIVGLLVLANGAVMIARCIQHRKRLQRNRLTK 
, \EQLKQI \PTHDYQKGDQYDVCAICLDEYEDGDKLRVLPCAHAY 
HSRCVDP WLTQTRKTCP I CKQ PVHRGPGDEDQEEETQGQEEGDB 
I GEPRDHPASERTPLLGSS PTLPTS FGSLAPAPLVFPGPS TDPPL 
SPPSSPVILV 



STlSCRACTs^ATPGAQSHRSARGHAAGG KKTAALGMERGKVKk" 
KEKEKETQKEKIGEKGREEKVKRKEVEQKIKQEKQEKQERRICGK 
EKEEKRTKQGKETNKEKEQFKGQEEKGENKDSTLTRTPLEPLEK 
NKQILVLGLDGAGKTSVLHSLASNRVQHSVAPTQGFHAVCINTE 
I DS QME FLE IGGS KPFRS YWEM YLSN/ ADS LARS FSVGFKQDSOP 
XTWKAKKYLHQLIAANPVLPLWFANKQDLEAAYHITDIHEALA 



I UPRFVTENTTKAPAQERTTQPRSSREGTLKy'jL'MK Y bSALNPSDlT 
LR S VSNI5 SEFGRRVWTSAPPPQRP FR VCDHKRTI R KG LTAATR 
QE LXiAKAL ETLLLNG VLTL VLE EDGTAVDS BD FFQLLEDDTCLM 

I ^QSGQSWSPTRSGVLSYGLGRERPKHSKDIARFTFDVYKONPR 
DLFGSLNVKATFYGLYSMSCDFQGL\GPKKVLRELLRWTSTLLQ 

GLGHMLLGISSTLRHAVEGAEQWQQKGRLHSY


6120 


785 


179 


liE RAGGG GLS SRALVGSGACLS LVARANG KG LPRGR KE FVBAVR 
VRYVAFR YRTPRAVCLRLWS CRRE VI MSGRGKQGGKVRAKAKSR 
S S RAGLQ F P VGR VHRLLRKGNYAE R VGAGAP VYLAAVLE YLTAE 
I LE LAGNAARDNKKTR 1 1 P RHLQLA I RNDEE LNKLLGKVT I AQG 
G\VLPNIQAVLLPKKTESQKDEGANDP


6121 


1612 


107 


FVRAQARGS RQP VRRPLLGAGS RLRCRS CGRME PLKVE KFATAN 
RGNGLRAVTPI,RPGELLFRSDPLAYTVCKGSRGWCDRCLLGKE 
KLMRCSQCRVAKYCSAKCQKKAWPDHKRECKCLKSCKPRYPPDS 
VR LLGR WFKLMDGAPS ES EKLYS F YDLE S N INKLTE D KKEGLR 
QL VMT FQHFMRE E I QDASQLPP AFDL FEAFAKVI CNS FTI CNAE 
MQE VGVGL YPS ISLLNHSCDPNCS I VFNGPHLLLRAVRDIEVGE 
E LT I C YLDMLMTS EERRKQLRDQ YCFECD \ CFRCQTQ D KDADML 
TGDEQVWKEVQESLKKIEELKAHWKWEQVLAMCQAIISSNSERL 
PDINIYQLKVU)CAMDACINIiGLLEBAI,FYGTRTMEPYRIFFPG 
SHP VRGVQVMKVGKLQLHQGMF PQAMKNL RLAFD I MRVTHGREH 
SLIEDLILLLE/AMRRQHQSILRERSQRBIRRVSLLNALLRSHT 
LC FVS CVNLS YWKFCS VFV 


6122 


2 


2324 


RFRKMAIX3GAASQDE SS AAAAAAADSRMKNPSETSKPSMESGDG " 
NTGTQTNGLD FQKQP VP VGG AI S TAQAQAFLGHLHQVQ LAGTSL 
QAAAQSIiNVQSKSNEESGDSQQPSQPSQQPSVQAAIPQTQIiMLA 
GGQITGLTLTPAQOQLLLQQAQAQAQLIiAAAVQQHSASQQHSAA 
GATI SASAATPMTQI PLSQP IQ I AQDLQQLQQLQQQNLNLQQFV 
LVHPTTNLQPA\QFI I SQTPQGQQGIiLQA\QNLLTQLPRQSQAN 
LIjQSQPRI\TI*TSQPATPTCTIAATPIQTLPQSQSTPKRIDTPS 
LEEP\SDLEELEQFAKTFKQRRIKLGFT\QGDAGLAMVKLYGND 
FSPTTIFRFEALNLSFKNMCKLKPLLEKWLNDAENLSSDSSLSS 
PS ALN S PGI EGLS RRRKKRTS I EA\ N I RVALE KSFLEN\ QKPTS 
EEITMIADQLNMEKGVIRVWFCNRRQKEKRINPPSSGG\TSSSP 
I KAI FPSPTSLVATTPSLVTSSAATTLTVSPVLPLTSAAVTNLS 
VTGTSDTrSNNTATVI STAPPASS AVTS PSLSPSPSASASTSEA 
S S AS ETS TTQTTS TPLS S PLGTS Q VMVTASG LQTA/AQLLP FKG 
AAQLPANASIAAMAAAAGLNPSLMAPSQFAAGGALLSLNPGTLS 
GALS PALMSNSTLAT IQALASGGS L P I TS LDATGNL VFANAGGA 
PNI VTAPLFLNP QNLS LLTSN P VS LVSAAAAS AGNSAPVAS LHA 
TSTSAESIQNSLFTVASASGAASTTTTASKAQ 


6123 


3 


2544


HLLHRWFGTDMQM INFTTGE FQLTEACP YLGTHS EESRFGI LKL 

HLQPLEMKRVGWFTPADYGKVTSLILIRNNLTVIDMIGVEGFG 

AREI>L KVGGRLPGAGGSLRFKVPESTlrMDCRRQLKDSKQI LS I T 

KN FKVEN I G PLP I TVS S LKING YNCQG YGFEVLDCHQFS LD PNT 

S RDI S I VFTPD FTS S W VI RDL S L VTAADLE FRFTLNVTLPHHLL 

PLCADWPGPSWEESFWRLTVFFVSIiSIiLGVTLIAFQQAQYILM 

EFMKTRQRQNASSSSQQNNGPMDVISPHSYKSNCKNFLDTYGPS 

DKGRGKNCLPVNTPQSRIQNAAKRSPATYGHSQKKHKCSVYYSK 

HKTSTAAASSTSTTTEEKQTSPLGSSLPAAKEDICTDAMRENWI 

S LR YASG I NVNLQKNLTLP KNLLNKE ENTLKNT I VFSN PSS ECS 

MKEGIQTCMFPKETDIKrSENTAEFKERELCPLKTSKFCLPENHL 

PRNSPQYHQPDLPE I SRKNNGNNQQVPVKNEIVDHCENLKKVDTK 

PS S EKKIHKTSREDMFSEKQD I PFVEQEDP YRICKKLQE KREGNL 

QNLNWS KS RTCRKNKKRG VAP VS R PPEQSDLKLVCS DFERS ELS 

SDINVRSWCIQESTREVCKADAEIASSLPAAQREAEGYYQKPEK 

KCVDKFCSDSSSDCGSSSGSVRASRGSWGSWSSTSSSDGDKKPM 

VDAQHFLPAGDSVSQNDFPSEAPISLNLSHNICNPMTGNSLPQY 

AEPSCPSLPAGPTGVEEDKGLYSPGDLWPTPPVCVTSSLNCTLE 

NGVPCVIQESAPVHNSFIDWSATCEGQFSSAYCPLBLNDYNAFP 

E ENMN YANG FPC PADVQTDF IDHNS QSTWNTPP \NM PAS \ WGNA 

QFPSSSRPYLKSTPKACLPMSGLFGPI\WAP\QSDVYENCCPIN 


6124 






pttehsd/ thmenqa\wckeyypgf\npfraymnldiwttt\a • 

NRNAN FP1>S RDSS YCGNV 


6125 


1573 


236 


sdealriagergmgrvulfeislshgrwyspgeplagtvrvri, 
gaplpfrairvtcigscgvsnkandtawweegyfnsslsladk 
gslpagehsfpfqfllpataptsfegpfgkivhqvraaihtprf 

S KDHKCS LVFYI LSPLNLNS I PDIEQ PNVASATKKFS YKLVKTG 

S WIjTAS td lrg y wgqalq lhadvenqsgkdts p was llqkv 
sykakrwihdvrtiaevegagvkawrraqwheqilvpalpqsal 

PGCSLIHIDYYLQVSLKAPEATVTLPVFIGNIAV/NPCPSEPPA 

rpgaaswgptpgg\psappqeeaeaeaaaggphfldpvflstks 
hsqrqpllatlssvpgapepcpqdgspashplhpplcistgatv 
pyfaegsggpvpttstlilppeysswgypyeappsyeqscggve 


6126 


1 


904 


ktcpkltcaftvsvpdsccrvcrgdgelswehsdgdifrqpanr 
earhsyhrshydpppsrqagglsrfpgarshrgalmdsqqasgt 

I VQ I VTNNXH KHGQ VCV SNGKT YSHGES WHPNLRAFG IVECVLC 
TCNVTKQECKKIHCPNRYPCKYPQKIDGKCCKVCPG/KKAKEEL 
PGQSFDNKGYFCGEETMPVYESVFMEDGETTRKIALETERPPQV 

evhvwtirkgilqhfhiekiskrmfeelphfklvtrttlsqwki 

FTEGEAQI SQMCSSRVCRTELEDLVKVLYLERSEKGHC 


6127 


1224


389 


RLLSEAPCPRSRRRFQ^3NPEWGQAFVHVAVAGGLCAVAVFfGiF 
DS VS VQVG YEH YAEAP VAGLPAFIjAM P FWS LVNMAYTL LGLS WL 
HRGGAMGIiGPRYLKDVFAAMALLYGPVQWLRLWTQWRRAAVLDQ 
WLTLP I FAW PVAWCLYLDRGWRP \ WLFLSLECVSIAS YGLALLH 
PQGFEVALGAHWPAVGQAIiRT\HRHYG/ SATPSATYLALGVLS 
CLG FWLKLCDHQLARWRLFQCLTGHFWS KVCDVLQ FHFAFLFI* 
THFWTHPRFHPSGGKTR 


6128 


1335 


463 


VLPRRCLVFVVNTMDSSREPTIiGRLDAAGFWQVWQRFDADEKGY 
I EEKELDAFFLHMLMKLGTDDTVMKANLHKVKQQFMTTQDAS KD 
GRIRMKEIiAGMFLSEDENFLLLFRRENPLDSSVEFMQIWRKYDA 
DSSG F IS AAEL RN FIiRDLFIiHHfCKAI S E AKLE E YTGTMM K I FD R 

NKDGRLDLNDLARILALQENFLLQFKMDACSTEKRKGDFEKIFA 
YYDVS KTG ALEG P \ E VDGFVKDMMELVQPS I SG VDLDKFRE I LI* 
RHCD VNKDG K I Q KS EIALCI*GLKINP ' " I 


6129 


2511 


843 


TCJKMSRRQLERWVWSSQQVQARGRNVRAPRLGKIAMGLBMSSKD 
SPGSLDGRAWEDAQKPQSAWCGGRKTRVYATSSRRAPPSEGTRR 
G GAARPEKTAEEGP PAA PGS L RHS GPLGPHACPTAL PEPQ VTSA 
MSSQWG I EPIjYI KAEPAS PDSPKGS S ETETEPPVALAPG \ PAP 
TRCLPGHKEEEDGEGAGPGEQGGGKLVLSSLPKRLCLVCGDVAS 
GYHYGVASCEACKAFFKRTIQGSIBYSCPASNECEITKRRRKAC 
QACRFTKCLRVGMLKEGVRLDRVRGGRQKYKRRPEVDPLPFPGP 
F PAGP LAVAGGP R KTAAP VNAL VSHLLWEPE KI> YAMPDPAGP D 

GHLPAVATLCDLFDREIWTISWAKSIPGFSSLSLSDQMSVLQS 
VWMEVLVIiGVAQRSLTLQDEIAFAEYLVLDEEGARPAGLGELG\ 
AALLQLVRR LK3ALRLERE K Y VTjLXALALANS DS VHI EDEPRLWS 

SCEKLLHEALLEYEAGRAGPGGGAERRRAGRLLLTLPLLROTAG 
KVLAHFYGVKLEGKVPMHKLFLSMLEAMMD 


6130 


1764 

3


771

3

577


^FARSAHEGKMPKKKTGARKKAENRREREKQLRASRSTIDIAK" 
HPCNASME CDKCQRRQKNRAFC Y FCN S VQKL P I CAQCG KTKCMM 

iCSSDCVI KHAGVYSTGLAMVGAI CDFCEAWVCHGRKCLSTHACA 

:pltdaec\vecergvwdhggrifscsfchnflceddqfehqas 

:qvleaetfxcvscnrlgqhsclrckacfcddhtrskvfkqekg 

cqppcpkcghbtqetkdlsmstrslkfgrqtggeegdgasgyda 

^knlssdkygdtsyhdeeedeyeaeddeeeedegrkdsdtess 
>lftnlnlgrtyasg yah yeeqen 

t-KGUTM RE YKWVLGSG \GVGKS ALTV\QK V TCTFI EK YDPTI E 



6131 



6132 






1811 



1241 



DFYRKEIEV\DSSPSVAGISWTgQGTKQF\ASMRDLYIKKGQGC" 



6133 



4256 



IIiVySLVNQQSFQ\DIKPMRDQI IRVKVSEKVPVI \LVGN\SVD 
LESEREVSSSEGRALAEEWGCPFMETSAKSKTMVDELFAEIVRQ 
MNYAAQPDKDDPCCSACN IQ 

SSPREKTSDSSKRPSRHGFLFLRLVGLSPFSYLCVPPSRPVPGS" 
PRSLSAMRLLPLAPGRLRRGSPRHLPSCSPALLLLVLGGCLGVF 
GVAAGTRRPNWLLLTDDQDEVLGGMTPLKKTKALIGEMGMTFS 
SAYVPSALCCPSRASILTGKYPHNHHWNNTLEGNCSSKSWQKI 
QEPNTFPAIIiRSMCGYQTFF\AGKYLNEYGAPDAGGLEHVPLGW 
SYWYALEKNSKYYNYTLSINGKARKHGENYSVDYLTDVIANVSL 
DFLDYKSWFEPFFMMTATP\APHSPWTAAPQYQKAFQNVFAPRN 
KWFNIHGTNKHWLIRQAKTPMTNSSIQFLDNAFRKRWQTLLSVD 
DLVEKLVKRLEFTGELNNTYIFYTSDNGYHTGQFSIiPIDKRQLY 
EFDIKVPLLVRGPGIKPNQTSKMIiVANIDIjGPTILDIAGYDIjNK 
TQMDGMSLLPILRGASNIiTWRSBVLVEYQGEGRNVTDPTCPSLS 
PGVSQCFPDCVCEDAYNNTYACVRTMSALWNLQYCEFDDQEVFV 
BVYNLrADPDQITNIAKTIDPELliGKMNYRLMMLQSCSGPTCRT 
PGVFPPGYRFDPRLWFSNRG5VRTRRFSKHLL 

AAGLLPPGLVPEDPRRTRNLLPFGIQGPPFALSRPLFSCVESGW 
AWEAMEPEFLYDLLQLPKGVEPPAEEELSKGGKKKYLPPTSRKD 
PKFEELQKPA\VLMEWINATLLPEHIWRSLEEDMFDGLILHHL 
FQRIiAALKIiEAEDIALTATSQKHKLTVVLEAVNRS\CSWRSGRP 
SGA/WESIFNKDLLSTLHX.LVALAKRFQPDLSLPTNVQVEVITI 
ESTKSGLKSEKLVEQLTEYSTDKDEPPKDVFDELFKLAPEKVNA 
VKEAI VNFVNQKLDRLGLS VQNLDTQFADGVI LLLLIGQLEGFF 
LHLKEFYLTPNS PAEMLHNVTLALELL/ IGRGPAQLPC /LALK/ 
T I VNKDAKS TLRVL YGL FCKHTQKAHRDRTPHG APN 
FVHGS MADTDIiFMECEEEEIjEPWQ KI SDVIEDS WEDYNS VDKT 
TTVSVSQQPVSAPVPIAAHASVAGHLSrSTTVSSSGAQNSDSTK 
KTLVTLI ANNNAGN PLVQQGGQPL ILTQNPAPGLGTM VTQ PVLR 
P VQVMQNANHVTSSPVASQPI FITTQGFP VRNVRP VQNAMNQVG 
I VLNVQQGQTVRP I TLVP A PGTQFVK PTVGVPQVFSQMT PVR PG 
STMPVRPTTNTFTTVIPATLTIRSTVPQSQSQQTKSTPSTSTTP 
TATQPTS LGQLAVQS PGQSNQTTNPKLAPS FPS P PAVS IAS FVT 

vkrpgvtgensnevaklvntlntipslgqspgpvwsnnssahV 

GSQRTSG PESSM KVTSS I PVFDLQDGGR K ICPRCNAQFRVTEAIi 

rghmcyccpemveyqkkgksldsepsvpsaakppspektapvas 
/thpsstpipalsppy/tkvpepnenvgdavqtklimlvddfyy 
grixmkvaqltnfpkvatsfrcphctkrlknnirfmnhmkhhv^ 

IiDQQNGEVDGHTICQHCYRQFSTPFQLQCHLENVHSPYESTTKC 

kicewafeseplflqhmkdthkpgempyvcqvcqyrsslysevd 
vhfrmihedrrhllcpyclkvfkngnafqqhymrhqkr\nvyh\ 

CNKCRVQFLFAKDKIEHKLQHHKTFRKPKQLEGLKPGTKVTIRA 
SRGQPRTVpVSSNDTPPSALQEAAPLTSSMDPLPVFIiYPPVQRS 
IQKRAVRKMSVMGRQTCLECSFEIPDFPNHFPTYVHCSLCRYST 
CCSRAYANHMINNHVPRKSPKYLALFKNSVSGIKLACTSCTFVT 
SVGDAMAKHLVFNPSHRSSSILPRGLTWIAHSRHGQTRDRVHDR 
WKNMYPPPSFPTNKAATVKSAGATPAEPEELLTPLAPALPSPA 
S TATPPPTPTHPQAIiAIjPPLATEGAECIiNVDDQDEGS PVTQEPE 
LASGGGGSGGVGKKEQLSVKKLRVVLFALCCNTEQAAEHFRNPQ 
RRIRRWLRRFQASQGENLEGKYLSFEAEEKLAEWVIiTQREQQLP 
VNEETLFQKATKIGRSLEGGFKISYEWAVRFMLRHHLTPHARRA 
VAHTLPKDVAENAGLFIDFVQRQIHNQDLPLSMrVAIDEISLFL 
DTE VIjSS DDRKENALQTVGTGE P WCD WLAI LADGTVL PTIj VF Y 

RGQMDQPANMPDSILLEAKESGYSDDEIMELWSTRVWQKHTACQ 
RSKGMLVMDCHRTHLSEEVIAMLSASSTLPAVVPAGCSSKIQPL 








DVCIKRTVKNFLHKKWKEQAREMADTACDSDVLLQLVLVWLGEV 
LGV I GDC P EL VQRS FL VAS VLPG PDGN INS PTRN ADMQEEL I AS 

LEEQLKLSGEHSESSTPRPRSSPEETIEPESLHQLFEGESETES 
FYGFEEADLDLMEI 


6134


2 


4256


FVHGSMADTDLFiyEECEEEELEPWQKISDVIBDSWEDYNSVDKT 

TTVSVSQQPVS APVP IAAriASVAGHLSTSTTVSSSGAQNSDSTK 

KTL VTLI ANNNAGNPLVQQGGQ P L I LTQN P APGLGTMVTQ PVLR 

P VQ VMQNANHVTSS PVAS QP I F I TTQGFPVRN VRP VQNAMNQ VG 

IVLNVQQGQTVRPITIiVPAPGTQFVKPTVGVPQVFSQMTPVRPG 

STMPVRPTTNTFTTVIPATLTIRSTVPQSQSQQTKSTPSTSTTP 

TATQPTSLGQLAVQSPGQSNQTTNPKLAPSFPS PPAVS IASFVT 

VKRPGVTGENSNEVAKLVNTLNTIPSLGQSPGPWVSNNSSAH\ 

GSQRTSGPESSMKVTSS I PVFDLQDGGRXICPRCNAQFRVTEAI, 

RGHMC YCCPEMVEYQKKGKSDDSE PS VPSAAKP PS PEKTAPVAS 

/THPSSTPIPALSPPY/TKVPEPNENVGDAVQTKLIMLVDDFYY 

GRDGGKVAQLTNFPKVATSFRCPHCTKRLKNNIRFMNHMKHHVE 

LDQQNGE VDGHTI CX3HCYRQFSTPFQLQCHLENVHS P YESTTKC 

KICEWAFESEPIiFLQHMKDTHKPGEMPYVCQVCQYRSSLYSEVD 

VHFRM I HEDTRHLLCP YCLKVFKNGNAFQQHYMRHQKR \ NVYH\ 

CNKCRVQFLFAKDKIEHKLQHHKTFRKPKQLEGIiKPGTKVTIRA 

SRGQPRTVPVSSNDTPPSALQEAAPLTSSMDPLPVFIjYPPVQRS 

IQKRAVRKMSVMGRQTCLECSFEIPDFPNHFPTYVHCSIiCRYST 

CCS RAYANHM INNHVPR KS PKYLALF KNS VSG I KLACTS CTF VT 

S VGDAMAKHLVFNPSHRSS S I LPRGLT WI AHSRHGQTRDRVHDR 

NVKNMYPPPS FPTNKAATVKS AGATPAEPEELLTPLAPALPS PA 

STATPPPTPTHPQALALPPLATEGAECLNVDDQDEGSPVTQBPE 

LASGGGGSGGVGKKEQLSVKKLRWLFALCCNTEQAAEHFRNPQ 

RRIRRWLRRFQASQGENLEGKYLSFEAEEKIAEWVLTQREQQLP 

VNEETLFQKATKIGRSLEGGFKISYEWAVRFMLRHHLTPHARRA 

VAHTLPKDVAENAGLFlDFVQRQIHNQDLPLSMIVAIDEISLFIi 

DTEVLS S DDRKENAIiQTVGTGEP WCDWLAILADGTVLPTLVFY 

RGQMDQPANMPDS ILLEAKESG YSDDE I MELWSTRVWQKHTACQ 

RSKGMLVMDCHRTHLiSEEVLAMLSASSTLPAWPAGCSSKIQPL 

DVCI KRTVKNFLHKKWKEQAREMADTACDSDVLLQLVLVWLGE V 

LGVIGDCPELVQRSFLVASVLPGPDGNINSPTRNADMQEELIAS 

LEEQLKLSGEHSESSTPRPRSSPEETIEPESIiHQLPEGESETES 
FYGFEEADLDLME I 


6135


2 


4256 


FVHGSMADTDLFMECEEEELEPWQKISDVI EDS WEDYNSVDKT 
TTVSVSQQPVSAPVPIAAHASVAGHIiSTSTTVSSSGAQNSDSTK 
KTLVTLI ANNNAGKP LVQO^SGQ PL I LTQNPAPGLGTMVTQP VLR 
PVQVMQNANHVTSS PVASQP I FI TTQG FP VRNVRP VQNAMNQVG 
IVLNVQQGQTVRPITLVPAPGTQFVKPTVGVPQVFSQMTPVRPG 
S TMP VRPTTNTFTT V I PATLT IRS TVPQSQSQQTKS TPS XS TTP 
TATQPTSLGQLAVQS PGQSNQTTNPKLAPS FPS PPAVS IAS FVT 
VKRPGVTGENSNE VAKLVNTLNTI PSLGOS PRPVW<?Mweea u \ 
GS QRTSGPE S S MKVTS S I PVFDLQDGGRKI C PRCNAQFRVTE AL 
RGHMCYCCP EMVE YQ KKGKS LDS E PS VPSAAKP PS PEKTAPVAS 
/THPSSTPIPALSPPY/TKVPEPWENVGDAVQTKLIMLVDDFYY 
GRDGGKVAQLTNFPKVATS FRCPHCTKRLKNNIRFMNHMKHHVE 
LDQQNGE VDGHTI CQHCYRQFSTPFQLQCHLEKVHSP YESTTKC 
KICEWAFESEPLFLQHMKDTHKPGEMPYVCQVCQYRSSLYSEVD 
VHFRMIHEDTRHLLCPYCLKVFKNGNAFQQHYMRHQKR\NVYH\ 
CNKCRVQFLFAKDKIEHKLQHHKTFRKPKQLEGLKPGTKVTIRA 
SRGQPRTVPVSSNDTPPSALQEAAPLTSSMDPLPVFLYPPVQRS 
IQKRAVRKMS VMGRQTCLECS FE I PDFPNHFPT YVHCSLCR YST 
CCSRAYANHMINNHVPRKSPKYLALFKNSVSGI KLACTS CTFVT 








S VGD AMAKHL VFNPS H RSS S I L PRGLiT W I AHS RHGQTRDRVHDR " 

** » »v*ii 'iff r j? ivtr<*\t\x VI\.o/i\j/\i xrAliPlLilLiIjTPJUAPAuPSPA 

STATPPPTPTHPQALALPPLATEGAECLNVDDQDEGSPVTQEPE 
I1ASGGGGSGGVGKKEQLSVKKLRWI.FALCCNTEQAAEHPRNPQ 
RRIRRWLRRFQASQGENLEGKYLSFEAEEKLAEWVLTQREQQIjP 
VWEETLFQKATKIGRSIiEGGFKISYEWAVRFMLRHmiTPHARRA 
VAHTLPKDVAENAGLFIDFVQRQIHNQDLPLSMIVAIDEISLFL 
DTEVLSSDDRKENALQTVGTGEPWCDWLAI LADGTVLP TLVFY 
RGQMDQPANMPDSIIiLEAKESGYSDDEIMELWSTRVWQKHTACQ 
R S KGMLVMD CHRTHLS E E VLAMLS ASSTLP A WPAGCS S Kl QPL 
v ravr onA.zvwi^Cjy>iKJSri/iLil ALUoDVIjI^LVLVWLGEV 
LGVIGDC PEL VQRS FLVAS VLPG PDGN INS PTRNADMQEE L IAS 

LEEQLKLSGEHSESSTPRPRSSPEETIEPESLHQLFEGESETES 
F YGFEEADLDLME I 


6136 


1704 


539 


FGVRMALEGMSKRKRKRSVQEGENPDDGVRGSPPEDYRIiGQVAS" 

SLFRGEHHSRGGTCRIASLFSSLEPQIQPVYVPVPK\ESALASA 

DLEEEIHQKQGQKRKNSQPGVKVADRKILDDTEDTWSQRXKIQ 

INQEEERLKNERTVFVGNLPVTCNKKKLKSFFKEYGQIESVRFR 

SLIPAEGTLSKKLAAIKRKIHPDQKNINAYWFKEESAATQALK 

RNGAQ I ADG FR I RVD LAS ETS SRDKRS VF VGNLP YKVE ES Al EIK 

HFLDCGSIMAVRI^DiOMTGIGKGFGYVLPBNTDSVHLALKLNN 

SELMGRKLRVMRSVNKEKFKQQNSNPRLKNVSKPKQGLNFTSKT 

AEGHPKSLFIGEKAVLLKTKKKGQKKSGRPKKQRKQK 


6137 


141 


2656 


RALRKRRCGPGRRGALGSGPGPQRRPGRVPEERPAPPRERKHPG 
MWNML I VAM CLA\ LLGL PG KAQELQGHVS \ 1 1 LAG EQLGDLAKK 
YLWQG\LFQLYLDEAGRGHSFSFHGAALTAPKQGQELMAKALES 
LSCPKDMAPSHCAEHKDQFLQLSQYRQLKTAEDYQALNKDIEAQ 
LQHAGLREAGGIFYFSVPPFAYEDIARNINSSCRPGPGAWLRW 
LEKPFGHDHFSAQQLATELGTFFQEEEMYRVDHYLGKQAVAQIL 
PFRDQNRKALDGLWNRHHVERVEIIMKETVDAEGRTSFYEEYGV 
*svw v jisjviaxj J-cvLit uvAMJSij ^llNV&bAEAVLRHKLQVFQALRGL 
QRGSAWGQ YQS YSEQVRRELQ KPDS FHSLTPTFAGVL VH I DNL 
R WEGVP F ILKS G KALDER VG YARILFKNQACC VQS E KHWAAAQS 
QCLPRQLVFHIGHGDLGSPAVLVSRNLFRPSLPSSWXEMEGPPG 
LRLFGSPLSDYYAYSPVRERDAHSVLLSHIFHGRKNFFITTENL 
LASWNFWTPLLESLAHKAPRLYPGGAENGRLLDFEFSSGRLFFS 
QQQPEQLVPGPGPGPMPSDFQVLRAKYRESSLVSAWSEEL I S KL 
ANDIBATAVRAVRRFGQFHLALSGGSS PVALFQQLATAHYGFPW 
AHTHLWLVDERCVPL3DPESNFQGLQAHLLQHVRI PYYNIH \AM 
P VHLQQRLCAE EDQG AHI YAR E I S ALGANS S FDLVLLGMGADGH 
TASLFPQSPTGLDGEQLVVLTTSPSQPHRRMSLSLPLINRAKKV 
AVLVMGRMKRE I TTL VS R VGH E P ECKWP I S G VLPHSGQLVW YMD Y 
DAFLG 


6138 


4587 


934 



EFSKLTDRWQNAVQGVRQRKGDVDGLVRQWQDFTTSVENLFRFL 
TDTS HLLS AVKGQ ER FS LYQTRS L IHE L KWKE I H FQR RRTTCAL 
TLEAGEICLLLTTDLKTKESVGRRISQLQDSWKDMEPQLAEMIKQ 
FQSrVETWDQCEKKIKELKSRLOVLKAQSEDPLPELHEDLHNEK 
ELIKELEQSLASWTQNLKELOTMKADLTRHVLVEDVMVLKBQIE 
HLHRQWEDLCLRVAIRKQEIEDRLNTWWFNEKNKELCAWLVQM 
ENKVLQTADISIEEMIEKLQKDCMEEINLFSENKLQLKQMGDQL 
IKASNKSRAAEIDDKLNKINDRWQHLFDVIGSRVKKLKETFAFI 
QQLDKNM SNLRTWLAR I ESEL S KP WYDVCDDQEIQ KRLAEQQD 
LQRDIEQHSAGVESVFNrCDVLLHDSDACANETECaDSIQQTTRS 
UDRRWRN I CAMSMERRMKI EETWRLWQKFLDD YSRFED WLKS AB 
^TAACPNSSEVLYTSAKEELKRFEAFQRQIHERLTQLELINKQY 
^LARENRTDTASRLKQMVHEGNQRWDNLQRRVTAVLRRLRHFT 








NQREBPEGTRESIIiVWLTBMDLQLTNVEHFSBSDADDKMRQLNG 
FQQEITLNTNKIDQLIVFGEQLIQKSEPVLDAVLIEDEIiEELHR 
YCQEVFGRVSRFHRRIiTSCTPGLEDEKEASENETDMEDPREIQT 
DSWRKRGESEEPSSPQSLCHLVAPGHERSGCETPVSVDS\ I PLE 
WDHTGRRGGPSSSH\EEDEEAQYY\SALSGKSISDGHSWHVPDS 
PSCPEHHYKQMEGDRKTVPPVPPASSTPYKPPYGKLLLPPGTDGG 
KEGPRVLNGNPQQEDGGLAGITEQQSGAFDRWEMIQAQEL\HNK 
LKIKQNLQQLNSDISAITTWLKKTEAELEMLKMAKPPSDIQEIE 
LR VKRLQE I LKAFDT YKALVVS VNVSS KE FLQTES PESTELQSR 
LRQLSLLWEAAQGAVDSWRGGLRQSLMQCQDFHQLSQNIiLLWIA 
S AKNRRQKAHVTD P KAD PRAL L ECRR E LMQLE KE LVERQPQVDM 
LQEISNSLLIKGHGEDCIEAEEKVHVI\EKKLKQLREQVSQDLM 
ALQGTQNPAS PLPS F DEVDSGDQPP ATS V PAPRAKQFRAVRTT E 
GEEETESRVPGSTRPQRSFLSRWRAALPLQLIiLLLLLLIiACLIi 
PSSEEDYSCTQANNF\ARSFYPMLRYTNGPPPT 


6139 


52 


1131 


LGDWVWSRTCGVLETPTSVIiRRARARGPCPTDSKWALPRLREGE 
T ERR P WEAS S WKTL / LAG W I GG AAS VI VGH P LDTVKTRLQ AGVG 
YGNTLS C I R WYRRESMFGFFKGMS FPLAS I AVYNS WFGVFSN 
TQRFLSQHRCGEPEASPPRTLSDLLLASMVAGWSVGLGGPVDIi 
IKIRLQMQTPPVSGRQPRFEVQGSGSCG\EPAYQGPVHCITTIV 
RNEGLAGLYRGASAMLLRDVPGYCIiYFIPYVFIiSEWlTPEACTG 
PS P CAVWIAGGMAG A I S WGTATPMD WKS RLQADGVYLNKYKGV 
LDCISQSYQKEGLKVFFRGITVNAVRGFPMSAAMFLGYELSLQA 
IRGDHAVTSP 


6140 


694 


136 


RPELELWRLRSRSWRPLGVPRRCHRRNWKBPVRAQPLSVTVWAP 
RCQRP/QPPAPEPSSPNAAVPEAIPTPRAAASAAIiELPLGPAPV 
S VAPQARAEARSTPG PAGS RLGPET FRQRFRQ FR YQDAAGPREA 
FRQLREL/SPRQWLRPDI\RTKEQ\IVEMLVQEQLLAIIjPEAAR 
ARRIRRRTDVRITG 


6141 


2 


984 


AQ VGPRSRPCKM PLKLRGKKKAKS KETAGLVEGEPTGAGGGS jJs ™ 
ASRAPARRLVFHAQLAHGSATGRVEGFSSIQELYAQIAGAFEXS 
PSE I LYCTLNTPKIDMERLLGGQLGLEDFI FAHVKGIEKE VNVY 
KSEDS LGLTI TDNGVG YAF I KRI KDGGVIDS VKTICVGDH IES I 
NGENIVGWRHYDVAKKLKELKKEELFTMKLIEPKKAFEIELRSK 
AGKSSGEKIGCGRATLRLRSKGPATVEEMPSETKAK\AIEKIDD 
VLELYMGIRDIDLATTMFEAGKDKVNPDEFAVALDETLGDFAFP 
DEFVFD VWGVI GDAKRRGL 


6142 


116 


602 


EAEGEQVCGAKCCGDAPHVENREEETAR IGPGVMES KEERALNN 
LIVENVNQENDEKDEKEQVANKGEPLALPLNVSEYCVPRGNRRR 
FR VRQ P I LQ YRWDI MHRLGE PQARMREENMERIGE EVRQkME KL 
REKQLSHSLRAVSTDPPHHDHHDEFC\ LMP i 


6143 


2802 


270 


FRMRIFLHCPWNQQMWKiWNLLETSLESCKAHLSIQKIiLKER\Q 
\QLP VFKHRDS I VETLKRHR WWAG ET \GSGKS TQ VPHFLL ED 
LLLN E V7EASKCN I VCTQ PRR I SAVS LANRVCDE LGCENGPGGRN 
SLCGYQIRMESRACESTRLLYCrrJX5VLLRKLQEDGIiLSNVS/HM 
FIVDEV\HER\SVQSDFLLIILKEILQKRSDLHLILMSATVDSE 
KFSTYFTHCPILRISGRSYPVEVFHLEDIIEETGFVLEKDSEYC 
QKFLE EEEBVTINVTSKAGG I KKYQEYI P VQTGAHADLNP FYQK 
YSSRTQHAILYMNPHKINLDLILELLAYLDKSPQFRNIEGAVLI 
FLPGLAHIQQLYDLLSNDRRFYSBRYKVIALHSILSTQDQAAAF 
TLPPPGVRKIVIiATNIAETG ITI PDWF VI DTGRTKBWKYHES S 
QMSSLVETFVS KASALQRQGRAGR VR DGFCFRM YTRERFEG FMD 
YSVPEILRVPLEELCLHIMKCNLGSPEDFLSKALDPPQLQVISN 
AMMLLRKIGACELNEPKLTPLGOHLAAIiPVNVKIGKMLI FGAI F 
GCLDP VATLAAVMTE KS PFTTPIGRKDEADLAKSALAMADSDHL 
TIYNAYLGWKKARQEGGYRSEITYCRRNFLNRTSIiIjTLEDVKQE 








LI KLVKAAGFSS STTSTSWEGNRASQTLS FQEI ALLKAVI>VAGL 
YDNVGKI I YTKS VDVTBKLACI VETAQGKAQVHPSSVNRDLQTH 
GWLLYQE K I R YARVYLRETTL I TPFPVLLFGGDI EVQHRERLLS 
IDGWI YFQAPVKI AVI FKQLRVL I DS VLRKKLENP KMSLEND K I 
LQIITELIKTENN 


6144 


1289 


568 


SG PG5MSGQRVDVKWMLGKE Y VGKTS LVE R YVHDR FLVG P YQN 
VSASGGARHGGRGSGGPVICTYGPDI*FPLVA\riGAAFVAKVMS 
VGDRTVTLGIWDTAGSERYEAMSRIYYRGAKAAIVCYDLTDSSS 
FERAKFWVKELRSLEEGCQIYLCGTKSDLLEEDRRRRRVDFHDV 
QDYADNIKAQLFETSSKTGQSVDELFQKVAEDYVSVAAFQVMTE 
DKGVDLGQKPNPYFYSCCHH 


6145


1109 


196 


GGMDLSELERDNTGRCRLSSPVPAVCRKEPCVLGVDEAGRGPVI, 
GPMVYAICYCPIiPRLADbEALKVADSKTLLESERERLFAKMEDT 
DFVGWALDVLSPNLISTSMDGRVKYNIjNSLSHDTATGLIQYAU[) 
QGVNVTQVFVDTVGMPE TYQARLQQSFPG IE VTVKAKADALYPV 
\VSAAS 1 CAKVARDQAVKKWQFVEKLQDLDTDYG\SGYPNDPQD 
/TKAWLKEHVEPVF\GFP\QFVRF\SWRTAQTI\LEKEAEDVIR 
EDSASENQEGLRKITSYFLNEGSOARPRSSHRYFLERGLESTTS 
L 


6146 


428 


781 


LKKKGKEKAEAQQVEALPGPSLDQWHRSAGEEEDGPVIjTDEQKS 
R/ YPGHEAHDQGG\WDARQSI IRKWDPETGRTRLI KGDGEVLE 
EI VTKERHRE INKQATRGDCLAFQMRAGLLP 


6147 


1 


2304 


GTRQLPPPSPGSGPGDS PEGPEGEAPERRRKAHGMDKLYYGLSE "' 
GEAAGRPAGPDPLDPTDLNGAHFDPEVYLDKLRRECPIAQLMDS 
ETDMVRQIRALDSDMQTLVYENYNKFISATDTIRKMKNDFRKME 
DEMDRLATNMAVITDFSAR ISATLQDRHERITKLAGVHALLRKL 
QFLFELPSRLTKCVELGAYGOAVRYQGRAQAVLQQYQHLPSFRA 
IQDDCQ VITAR LAQQLRQR FREGGSGAP EQABC VEIjLIjALiGE PA 
EELCEE FLAHARGRLE KE LRNLEAELG PS PPAP D VLEFTDHG \ S 
SGFVGGLCQVAAAYQEIiFAAQGPAGAEKLAAFARQLGSRYFALV 
ERRLAQEQGGGDNSLLVRALDRFHRRLRAPGALLAAAGLADAAT 
EIVERVARERLGHHLQGLRAAFLGCLTDVRQALAAPRVAGKEG P 
GLAELLANVAS S I L,SH I KAS LAAVHL FTAKEVS FSNKPYFRGE F 
CSQGVREGLIVGFVHSMCQTAQSFCDSPGEKGGATPPALLLLLS 
RLCLDYETATISYILTLTDEQFLVQDQFPVTPVSTLCAEARETA 
RRLLTHYVKVQGLVISQMLRKSVETRDWLSTLEPRNVRAVMKRV 
VEDTTAIDVQVLPRLAGVA3UTQAGGTVPSRGAGAAEDHWQSLPG 
GGDMC I WASHGASS VARASVREPQGNKSPRMNTKRAGECLCPRS 
CS FSAQD YDI FAP I L P VE KQRLR VTQE VRAGI^VLVLK I RPQTNS 
CILPLPHSTGS INSDHVPTK 


6148 


3056 


353 


VPAVGGTFADGAMGEAEKFHYI YSCDLD INVQLKI GSLEGKREQ 
KSYKAVLEDPMLKFSGLYQETCSDLYVTCQVFAEGKPIiALPVRT 
SYKAFSTRWNWNEWLKLPVKYPDIjPRNAQVAIiTIWDVYGPGKAV 
PVGGTTVSLFGKYGMFRQGMHDLKVWPNCRSQMDQKPTKTPGRT 
SSTLSEDQMSRIAKLTKAHRQGHMVKVDWLDRLTFREIEMINES 
VKRSSNFMYLMGGFRCVKCDDKEYGIVYYEKDGDESSPILTSFE 
LVKVPDPQMSLENLVESKHHNLPRSLRSGPSDHDLKPYPSPRDQ 
LKNI VS Y P P S KP PT YEEQDtiVWEFR Y YLTNQDKALT K I LTS V I W 
DLPQGAKQALALLGKWKPMDVEDSLELLSSHYTNPTVRRYAVAR 
LRQADDEDLLMYLLQLVQALKYENFDDIKNGIjEPTKKDSQSSVS 
ENVSNSGINSAEIDSSQIIT/SAPFPSVSSPPP\ASKTKEVPDG 
ENLEQDI*CTFLI SRAS KNSTLAN YL YW YVI VECEDQDTQQRDP K 
THEMYLNVMRRFSO^LKGDKSVRVMRSLI^QO/TFVDRLVHLM 
KAVQRESGNRKKKNERLQALLGDNEKMNLSDVELIPIiPLEPQVK 
IRGI I PETATLFKSALMPAQLFFKTEDGGKYPVI FKHGDDLRQD 
QLI LQI ISLMDKLLRKENIiDLKLTPYKVLATSTKHGFMQFIQSV 
P VAE VLDTEGS I QN F FRK YA P S ENG PNG I SAEVMDT YVKS CAG Y 
CVITYI LGVGDRHLDNIjLLTKTGKLFH I DFGYILGRDPKPLP PP 
MKLNKEMVEGMGGTQSEQYQEFRKQCYTAFLHLRRYSNLILNLF 
S LM VDANI PDI ALEPDKTVKKVQDKFRIiDLSDEEAVHYMQSL I D 
ESVHALFAAWEQIHKFAQYWRK 


6149 


1 


1413 


RVDPRVRENGTANP I KNGKTS PAS KDQRTGKKTS VQGQ VQKGND 
ESESDFESDPPSPKSSEE EEQDDEE VLQG EQGD FNDDDTE P EN L 
GHRPLLMD S ED EEEEE KHS SDS D YEQAKAK YS DMS S VYRDRS GS 
GP TQDLNTI IiLTSAQliSS D VA VETPKQE FD VFGAVPFFAVRAQQ 
PQQE KNEKNL PQHRF PAAGLEQE E FDVFTKAP FS KK VNVQE CHA 
VGPEAHTIPGYPKSVDVFGSTPFQPFLTSTSKSESNEDLFGLVP 
FDErTGSQQQKVKQRSLQKLSSRQRRTKQDMSKSNGKRHHGTPT 
STKKTLKPTYRTPERARRHKKVGRRDSQSSNEFLTISDSKENIS 
VALTDGKDRGNVLQPEESLI*DPFGAKPFHSPD\ljSWHPP\HQGtj 
S\DIRADHNT\VI,PGR\PRQNSLHGSFHSADVLKMDDFGAVP/F 
LTELWQSITPHQSQQSQPV\ELDPFGAAPFPSKQ 


6150 


372 


37 


MSNI KKYI 1 D YDWKAS I E I E t DHDVMTEEKLHQINNF WSDSE YR 
LNKHGSVLNAVL1MLAQHALLI AI S SDLNAYG WCE FDWNDGNG 
QEGWPPMDGSEGIRITDIDTSGIF 


6151 


1555 


521 


DSNQQSVSGTAASTLLHS FKATI Y YQGTGHVQQF YGVTS PYSQT 
TPPIVQSYAQPSLQYIQGQQIFTAHPQGVWQPAAAVTa-lVAPG 
QPQPI^PSEMVVTNNLLDLPPPSPPKPKTIVLPPNWKTARDPEG 
KIYYYHVITRQTQWDPPTWESPGDDASLEHEAEMDLGTPTYDEN 
PMK\ASKKPKTAEADTSSEI*AKKSKEVFRKEMSQFIVQCI,NPYR 
K PDC KVG \ R I TTTEDFKHLARKLTHGVMNKELKYCKNPE \ D L EC 
NENVKHKTKE Y I KKYMQKFGAVYKPKEDTEFRVTVGPGWEDGWS 
GKTDSRERKSCGPFCSTPVSTVIjLMIHHPGEFNPADVN 


6152
6153


1366 


648 


NRTWSTPSTWMGVAIiPPLCSTGPWPVTRQITARTTCXSAVPAKCP ~ 

pwc/dvheprcqppdchghgtcvdghcqctghfwrgpgcdeldc 
gpsncsqhglctetgcrcdagwtgsncseecplgwhgpgcqrpc 
kcehhcpcdpktgncsvsrvkqclqppeatlragelsfftrtaw 

LALTLALAFLLL1 STAANLS LLLSRAERNRRLHGDYAYHPLQEM 
NGEPLAAEKEQPGGAHNPFKD ~ i 




2 


3368 


5 


GRVGARSPGRAYALLLLIjICFNVGSGLHLQVLSTRNENKLLPKH 

PHLVRQKRAW ITAP VALLEGEDLSKKNP IAKIHS DLAEERGLKI 

TYKYTGKG I TEP PFG I FVFN KDTGELNVTS I LDREETPPFLI/TG 

YATjDARGNNVEKPLEIjRIKVX.DINDNEPVFTQDVFVGSVEELSA 

AHTLVMKINATDADEPNTLNSKISYRIVSLEPAYPPVFYLNKDT 

GEIYTTSVTIJDREEHSSYTLTVEARDGNGEVTDKPVKQAQVQIR 

ILDVNDNIPWENKVLEGMVEENQVNVEVTRIKVFDADEIGSDN 

WLANFTFASGNEGGYFHIETDAQTNEGIVTL1KEVDYEEMKNLD 

FS VI VANKAAFHKS I RS KYKP T P I P I KVKVKNVKEGIH FKS S VI 

S I YVSESMDRSSKGQI IGNFQAFDEJDTGLPAHARYVKLEDRDNW 

I S VDS VTS E I KLAKL PDFES R YVQNGTYTVKI VAl S ED YPRKT I 

TGTVLINVEDINDNCPTLIEPVQTICHDAEYVNVTAEDLDGHPN 

SGPFSFSVIDKPPGMAEKWKIARQESTSVLLQQSEKKLGRSEIQ 

FL ISDNQGFSCPEKQVLTLTVCEVLHGS \ GCREAQHDS YVGLGP 

AAIALMILAFLLLLLVPLLLLMCHCGKGAKGFTPIPGTIEMLHP 

WNNEGAPPEDKWPSFLPVDQGGSLVGRNGVGGMAKEATMKGSS 

SAS IVKGQHEMSEMDGRWEEHRSIjLSGRATQFTGATGAI \mtte 

TT I TARATGASRDVAGAQAAAVALN BE FLKN Y FTDKAAS YTEED 

ENHTAKDCLLVYSQEBTESLNASIGCCSFIEGELDDRFliDDLGI, 

KFKTLAEVCLGQKIDINKEIEQRQKPATETSMNTASHSLCEQTM 

imSENTYSSGSSFPVPKSLQEANAEKVTQEIVTERSVSSRQAQK 

/ATPLPDP MASRNVI ATETS YVTGS TM PPTTV I LGPS QPQ SL I V 

rERVYAPASTLVDQPYANEGTVVVTERVIQPHQGGSNPLEGTQH 
JLQDVPYVMVRER2SFLAPSSGVQPTIiAMPNIAVGQNVTVT2RVL 
APASTLQS S YQI PTENSMTARNTTVSGAGVPGPLPDFGLESSGH 




6154 


3660 


214* 


KKKTKMKNTLQKTVNFGAWPKPT I SDKSHLLQMVSKLDI»TDAKN 
SDTAHIKSIEITSILNGLQASESSAEDSEQEDERGAQDMDNNGK 
EESKI DHLTNNRNDLIS KEEQNS SSLUEENKVHADLVIS KPVSK 
SPERLRKDIEVLSEDTDYEEDEVTKKRKDVKKDTTDKSSKPQIK 
RGKRRYCNTEECIiKTGSPGKKEEKAXNKESLCMENSSNSSSDED 
EEETKAKMTPTKKYNGI*EEKRKSLRTTGFYSGB'SKVAEKRIKLL 
NNSDERLQNSRAKDRKDVWSSIQGQWPKKTLKELFSDSDTEAAA 
S P PHPA PEEG VAEESLQTVAEEESCS PSVELE KP PPVNVDS KPI 
EEKTVEVNDRKAEFPSSGSNFSA* I PLPYLHLNRLHQSL *QKGS 
RQQS S VTVS EPIiAPNQEEVRSI KSETDSTI E VDSVAGBLQDLQS 
ERE * LASRF* CQCELEQ * * S ARTRTS * KSL YRSEKSERCSGRRK 
FIKKAEKKP*SNSGKQQKEGK 




6155 


869 


121 


HLLPELRGKS WXTMKY VFYLGVLAGTFFFADSS VQKEDPAP YLV 
YL KSH FNPCVGVl*! KPSWVLAPAHCYLPNLKVMLGNFKSRVRDG 
TEQTINPIQIVRYW^SHSAPQDDLMIilKLAKPAMLNPKVQALN 
P\PTTNVRPGTVCLLSGLDWSQENSGRHPDI,RQNLEAPVMSDRE 
CQKTEQGKSHRNSLCVKFVKVFSRIFGEVAVATVICKDKliQGIE 
VGHFMGGDVGIYTNVYKYVSWIENTAKDK 




6156 


5725 


3984 


GTSTVTMATKKHFSIILNLIiGMLLKKDNQDTRKLLMTWAIiEVAV 
VMKKS ETYAPLFCLPS FHKFCKGLIiADTLVEDVNI CLQACSSLH 
PlLSS SLPDDLLQRCVDVCRVQLVHRGTCI RQAFGKLLKS I PLGV 
FLSNNNHTEIQEISLALRSHMSKAPSNTFHPQDFSD/VISFILY 
GNSHRTGKDKWLERLFYSCQRLDKRDQSTIPRNLLKTDAVLWQW 
AIWEAAQFTVLS KLRTPLGRAQDTFQTIEGI I RSLAGHTLNPDQ 
DVSQWTTADNDEGHGNNQLRLVLLLQYIjENLEKLMYNAYEGCAN 
ALTS P PKVIRTFL YTNRQTCQDWLTRIRIjS IMRVGLLAGQPAVT 

vrhgfdlltemkttslsqgnelevs i mm vvealcelhc p ea i qg 
iawsssivgkhllwinsvaqqaegrfekasveyqehl-camtgv 
dccissfdksvltlasagcksaslkhclngesrksvlskptdss 
pevinylgnkace cyistadwaavqewqnaihdlkkstssts ln 
lkadfnyikslssfesgkfvecteqlellpgeninliaggskek 
idmkkllrnm 


6157 


946 


329 


MAN RGPS YC3LSRE VQEK I EQK.Y DADLENKIjVDW 1 1 LQCAEDIEH 
PPPGRAHFQKWLMDGTVIiCICjINSIiYPPGQEPIPKISESKMAFK 
QMEQI S OFLKAAETYGYRTTD I FQTVDLWEGKDMAAVQRTLMAIi 
GSVAVTKDDGCYRGEPSWFHRKAQQNRRGFSEEQLRQGQNVIGL 
QMGSNKGASQAGMTG YGMPRQI M* DAASCP 


6158 


441 


1482 


LGSLIVLSLHCKVIFSSQSLERAMKEKAVDLVPILAQNPGIAQN 
r xjjc»Vjj\iyjtiNUr4 l^vDFXXDHVQDKKTD/SRSKSPHKKRSKSRER 
RKSRSRSHSRDKRKDTREKIKEKERVKEKDREKEREREKEREKE 
KERG KNKDRD KEREKOREKD KE KDREREREKEHEKDRDKEKE KE 

QDKEKEREKDRSKEIDEKRKKDKKSRTPPRSYNASRRSRSSSRE 
RRRRRSRSSSRSPRTSKTIKRKSSRSPSPRSRNKKDKKREKERD 
HISERRERERSTSMRKSSNDRDGKEKLEKNSTSLKEKEHNKEPD 
SS VS KEVDDKDAPRTEENKI QHNGNCQLNEENIiSTKTEAV 


6159


53


84 


AVIAPLHISIXSDRARPYLKNTEKSSTTCSRRRNQSFPPVMSIjTH 
RLHLCKYWGCAVSNVCRFWEGRPLPLMIWPYTLPVSLPVGSCV 
IITGTPIIjTFVKDPQLEVNFYTGMDEDSDIAFQFRLHFGHPAIM 

ns^fgiwryeekcyylpfedgkpfelciy^hkeykvmvngqr 
iynfahrfppasvkmlqvfrdisltrvlisd*grcvritavqef 
dvsvscdcttayqpg 


6160 


1626 


1790 


agakffp*f*kvadaqptbsekeiynqvnwi,kdaegiledlqs 
yrgagheireaiqhpadeklqekawgawplvgklkkfyefsqr 
LEAALRGLLGALTSTPYSPTQHLEREQALAKQFAEILHFTTjRFD 

ELKMTNPAIQNDFSYYRRTLSRMRINNVPAEGENBVNNEIiANRM 

Z>Jjb iAcATPMIjKTLSDATTKFVSENKNLPiENTT^ 

RVMLETPE YRS RFTNEETVS FCLRVMVGVI ILYDEVHPVGAFAK. 

TSKIDMKGCIKVLKDQPPNSVEGLtJ!IALRYTTKHLNI)ETTSKQI 

KSM1»Q*QLLTLVNKG


6161


455 


1569 


PVSGSES SLRRAWAS I LRLMLGPRVAVS ILCEDGISH*LLEKH* 
KS H VLEP LS S LALEEQCLALS LD WSTG KTGRAGDQPLK 1 1 S SD S 
TGQLHLLMVNETRPRLQKVASWQAHQFEAWIAAFNYWHPEIVYS 
GGDDGLLRGWDTRVPGKFLFTSKRHTMGVCS I QSSPHREHILAT 
GS YDEHI LL WDTRNM KQPLADTPVQGGVWR I KWHP FHHHLLIiAA 
CMHSG FK I LN CQ KAM E E RQE AT VLT3HTL PDS LVYG AD WS WLL F 
RS LQRAP S WS FPSNIiGTKTAD LKGAS ELPTP CHECR EDND G EG H 
ARPQSGMKPLTEGMRKNG TWLQATAATTRDCG VNPEEADSAFS L 
LATGS F YDHALHLWE WEGN


6162


1 


586 


RTIHATGRAGASPMHRIilVWRLAEANKQHVRCQKCXiEFGHWTYE 
CTGKRKYIjHRPSRTABLKKALKEKENKLJLLQQS IGETNVERKAK 
KKRSKSVTSSSSSSSDSSASDSSSESBETSTSSSSEDSDTDESS 
SSSSSSASSTTSSSSSDSDSDSSSSSKQ*HQHR*QL*R*TTKEE 
EKE IELLHS YWTDGLKTLM 


6163 


1081 


785 


R IRSTTEGCAVRLHPTQNTGKAR1MI LLSVS JjGRHWAFTYKFFL 
TP WFVFFFFF FHRKE * VMQKNPMKS REDEWMEKIiNNLHVQRAD 
MNRLI MNYLVTEGFKEAAEKFRMESG IEPSVDLETLDER IKIRE 
MILKGQXQEAIALINSLHPELIiDTNRYLYFHIiQQQHIjIELIRQR 
ETEAALEFAQTQLAEQGEESRECLTEMERTLALLAFDSPEESPF 
GDLLHTMQRQKVXffSEVNQAVLDYENRESTPKLAKLLKLLLWAQNf 
BIiDQKKVKYPKMTDLSKGVIEEPK 


6164 


90 


406 


PCQS PGRS RMRQD KLTGS LRRGGRCL KRQGGG VGT I LSNVLKKR 
SCISRTAPRLLCTLEPGVDTKLKFTLEPSLGQNGFQQWYDAIjKA 
VARLSTGI PKE WRRKVWLTLADHYMS IAIDWDKTMRFT FNERS 
NPDDDSMGIQIVKDIjHRTGCSSYCGQEAEQDRWLKRVLLAYAR 

wnktvgycqgfnilaalilevmegnegdalkimiylidkvlpes 
yfvnnlralsvdmav frdllrmklpelsqhldtlqrtankesgg 

G YEP PLTNVFTMQWFLTL FAT CLPNQTVLKI WDSVFFEGSE III* 
R VSLAI WAKLGEQ I E CCETADE F YS TMGRLTQEMIiENDLLQS HE 
LMQTVYSMAPFPFP QLAELREK YTYNI TP FPATVKPTS VSGRHS 
KARDSDEENDPDDEDAWNAVGCLGPFSGFLAPELQKYQKQIKB 
PNEEQSIiRSNNIAELSPGAINS CRSE YHAAFNSMMMERMTTD I N 
ALKRQYSRIKKKQQQQVHQVYIRADKGPVTS1LPSQVNSSPVIN 
HLLLGKKMKMTNRAAKNAVIHIPGHTGGKISPVPYEDLXTKLNS 
PWRTHIRVHKKNMPRTKSHPGCGDTVGLI DEQNEAS KTNGLGAA 
EAFPSGCTATAGREGSS PEGS TRRTI EGQS PEPVFGDAD VDVS A 
VQAKLGALELNQRDAAAETELRVHPPCQRHCPEPPSAPEENKAT 
SKAPQGSNSKTP I FS PFPSVKPLRKSATARNLGLYGPTERTPTV 
HF PQMSRS FS KPGGGN SGP* KM VFSSGTMLS RQLPG Y PQE YQRN 
GGERPG 


6165 


90 


405 


PCQSPGRSRMRQDKLTGSLRRGGRCLKRQGGGVUT3LSNVLKKR~ 
SCISRTAPRLLCTLEPGVDTKLKFTLEPSLGQNQFQQWYDALKA 
VARLSTGI PKSWRRKVWLTLADHYIjHS I AIDWDKTMRFTFNERS 
NPDDDSMGIQIVKDLHRTGCSSYCGQEAEQDRWLKRVLLAYAR 
WNKTVG YCQG FNILAAL I LEVMEGNEGDALK I MI YL IDKVLPES 
YFVNNLRALSVDMAVFRDLLRMKLPELSQHLDTLQRTANKESGG 
G YEP PLTNVFTMQWFLTLFATCiPNQTVLKI WDS VFFEGSEI IL 
R VS LAIWAKLG EQI E CCETADE FYSTMGRLTQEMLENDLLQ SHE 
LMQTVYSMAPFPFPQLAELREKYTYN I TPFPATVKPTS VSGRHS 
KARDSDEENDPDDEDAWNAVGCLGPFSGFLAPEU2KYQKQIKB 
PNBEQSLRSNNIAELSPGAINSCRSEYHAAFNSMMMERMTTDIN 
ALKRQYSRIKKKQQQQVHQVYIRABKGPVTSILPSQVNSSPVIN 
HLLLGKKMKMTNRAAKNAVIH IPGHTGGKI SPVP YEDLXTKLNS 
PWRTHIRVHKKNMPRTKSHPGCGDTVGLIDEQNEASKTNGLGAA 
EAF PSGCTATAGREGSS PEGSTRRTI EGQS PEPVPGDADVDVS A 
VQAKLGALELNQRDAAAETEIiRVHP PCQRHCPE P PSAPEENKAT 
SKAPQGSNSKTP I PS PFP S VKPLR KS ATARNLG L YGPTERT PTV 
HFPQMSRSFSKPGGGNSGP * KMVFSSGTMLSRQLPGYPQE YQRN 
GGERFG 


6166 


2 


1206 


HKLWRTVAMAGAEWKSLEECLEKHLPLPDLQEVKRVLYGKELRK 
LDL PR E AFEAAS REDFELQG YAFE AAEEQLRRPR I VHVGLVQNR 
I PLPANAPVAEQVSALHRR I KAJ VE VAAMCGVN I ICFQEAWTMP 
FAFCTRE KLP WTE FAES AEDGPTTR FCQKIAKNHDM WVS PILE 
RDSEHGDVLWNTAWISNSGAVLGKTRKNHIPRVGDFNESTYYM 
EGNLGHP VFQTQ FGRIAVNI CYGRHHPLNWLMYS INGAE 1 1 FN P 
SATIGALSESLWPIEARNAAIANHCFTCAINRVGTEHFPNEFTS 
GDGKKAHQDFGYFYGSSYVAAPDSSRTPGLSRSRDGLLVAKLDIi 
NLCQQVNDVWNFKMTGRYEMYARELAEAVKSNYSPTIVKE*PAS 
VPALG 


6167 


1220 


1844 


YG I VTG PSLCAGDKQPKKQEKNP VLVSPEFVDEA1.CACEE YL5N 
LAHMD I DKDLEAPIj YLTPEGWSLFLQRYYQWHEGAELRHIiDTQ 
VQRCED ILQQLQAWPQ I DMEGDRNI W I VKPGAKSRGRGIMCMD 
HLEEMIjKLVNGNPWMKDGKWWQKYIERPLLIFGTKFDLRQWF 
LVTDWW PLTVW F YRDS Y I R FS TQ PFS LKNLDK* APL YLTPEG WS 
LFLQRYYQWHEGAELRHU>TQVQRCEDILQQLQAWPQIDMEG 
DRNI WI VKPGAKS RGRG IMCMDHLEEMLKLVNGNP WMKDGKWV 
VQKY I ERP LL I FGTKFDIaRQ W FI/VTDWN P LTVWF YRDS Y I R FST 
QPFSLKNIiDK 


6168 


84 


1392 


VWPVPSVSAMPPKKQAQAGGSKKAEQKKKEKIIEDKTFGLKNKK 
GAKQQKFIKAVTHQVKFGQQNPRQVAQSEAEKKLKKDDKKKELQ 
ELNELFKP WAAQKI S KG AD PKS WCAFFKQGQ CTKGDKCKFSH 
DLTLERKCEKRSVYIDARDEELEKDTMDNWDEKKLEEVVNKKHG 
EAEKKKPKTQI VCKHFLEAI ENNKYGWFWVCPGGGD I CM YRHAL 
PPG FVLKKKKKKKKKEDE I S L* DL IERERSALGPNVTKITLES F 
LAWKKRKRQEKIDKLEQDMERRKADFKAGKALVISGREVFEFRP 
ELVNDDDEEADDTRYTQGTGGDE VDDSVS VNDIDLS LYI PRDVD 
ETG ITVAS LER FS TYTSDKDEWKX»S EASGGRAENGERSDLEEDN 
EREGTENGAIDAVPVDENLFTGEDLDELEEELNTIjDLEE 


6169 


112 


662 


APAAAMAERPE DLNLPNAV I TRI t KEALPDG VN I S KEARSA I S R 
AASVFVLYATSCANNFAMKGKRKTLNASDVLSAMEEMEFQRFVT 
PLKEALEAYRREQKGKKEASEQKKKDKDKKTDSEEQDKSRDEDN 
DEDEBRLEEEEQNEEEEVDN * KGRETVAP WKVPLEMRRATCFCE 
AFPCWAE 


6170 


62 


667 


STKVMLPNTGRLAGCTVFITGASRGIGKAIALKAAKDGANIVIA
*ir*.Lt\\jrnr 11 !A/UiiiilJ^VGGKAL»PCIVDVRDEQQISAA 
VEKAIKKFGG I DI LVNNASAI S LTNTLDTPTKRLDLMI4NVNTRG 
T YLAS KAC I P Y LKKS KVAHI PNI SPP LNLNPVW FKQHCGRW * W 
G *GDGLCLI CFELNLCMSDVI TI CT 


6171 


382 


941 


H FMQS D VELiDCDI E P CGHTKF P PTLPLS TTV I VCS CH PVAT AST 
MAEAFSKTTSEEDQSIQEPKEANSMTAQKQKK*GLRGSRRRHAN 
SGGDI FGDS FAAYF PR VLKQ VHQALSLSQEAVSVMD S MVRD I LD 

RIATEAGHLAHYSKCVTITSRDIRMAVCLLLPGKMGKLAESQGT 
NATLRYTKSK 


6172 


651 


54 


GLCRAGGAHRFSRTHVEAALKMLRREARI>RREYLYRKAREEAQR

SAQERKERLRRAI>EENRIjIPTE1»RREALAI*QGSLEFDDAGGEGV 

TSHVDDEYRWAGVEDPKVMITTSRDPSSRLKMFAKELKLVFPGA 
QRMNRGRHEVGALVRACKANGVTDLIiWHEHRGTPVGLIVSHLP
FGPTAYFTLCNVVMRHDI PDLGTMSEAKPHLITHGFSSRLGKRV 
SDILRYIjFPVPKDDSHRVITFANQDDYISPRHHVYKKTDHRNVE 
LTE VGPRFELKL YMI RLGTLEQEATADVEWRWHPYTNTARKRVF 
L S TE * AAPR PLGQLL 


6173


3 


288 


S VDHR E VQ VLSQS M P LT PHQAVLRGER P YMCVECGKCFGRS SHL 
LQHQRIHTGEKPYVCSVCGKAFSQSSVLSKHRTIHTGEKPYECN 
ECGKA FRVS S DLAQHHK I HTG E KPHE CLEC RKAFTQLS HLI QHQ 
RIHTGERPYVCPLCX3KAFNHSTVLRSHQRVHTGEKPHRCNECGK 
TFSVKRTLLQHQRIHTGEKPYTCSECGKAFSDRSVLIQHHNVHT 
GEKPYECSECGKTFSHRSTLMNHERIHTEEKPYACYECGKAFVQ 
H S HL IQHQK VHRKL * PTCVLSVGSALAGVPTSFS ISVSTLERSP 
MCAVYVGR P S ARAQS L VNTGQ FTQVRS PMS VMS VBKPLE 


6174 


1060 


959 


PRPPGKRWMVAGLGNPGLPGTRHSVGMAVLGQLARRLGVAESWT 
RDRHCAADLALAPLGDAQLVLLRPRRLMNANGRSVARAAELFGL 
TAEEVYLVHDELDKPLGRLiALKLGGSARGHNGVRSCISCLNSNA 
MPRLRVGlGRPAHPEAVQAHVIiGCFSPAEQELLPLLLDRATDLI 
LDHIRERSQGPSLGP*H*WFSKKA


6175 


2204 


334 


RYFRADPRSRSGQPRAEGLGAFAEGPIiRAMAAPVKGNRKQSTEG 
DALDPPASPKPAGKQNGIQNPISLEDSPEAGGEREEEQEREEEQ 
AFLVS LYKFMKERHTP I ER VPHLG FKQ INLWKI Y KAVE KLGA YE 
LVTGRRLWKNVYNELGGS PGSTSGATCTRRH Y* RLVLPYVRHLK 
GEDDKPLPTSKPRKQYKMAKENRGDDGATERPKKAKEERR14DQM 
MPGKTKADAADPAPLPSQEPPRNSTEQQGLASGSSVSFVGASGC 
P EAYKRLLS S F YCKG THG I MS PLAKKKLIiAQ VS KVE ALQ CQE EG 
CRHGAE PQAS PAVHL PE S PQSP KGLTENSRHRLTPQEGLQAPGG 
SLREEAQAGPCPAAPIFKGCFYTHPTEVLKPVSQHPRDFFSRLK 
DGVLLG PPGKEGLS VKE PQLVWGGDANR PS AFH KGGS R KG I UYP 
KPKACWVSPMAKVPAESPTIiPPTFPSSPGLGSKRSLEEEGAAHS 
GKRLRAVSPFLKEADAKKCGAKPAGSGLVSCLLGPALGPVPPEA 
YRGTMLHCPI>NFTGTPG PLKGQAALP FS PLVI PAFPAHFLATAG 
PSPMAAGLMHFPPTS FDSALRHRLCPAGSAWHAPP VTTYAAPHF 
FHLNTKL 


6176 


1040 


402 


PLSAIiRAMAEVHVIGQIIGASGFSESSLFCKWGIHTGAAWKLIiS 
GVREGQTQVDTPQIGDMAYWSHPIDLHFATKGLQGWPRLHFQVW 
SQDS FGRCQLAG YG FCHVPS S PGTHQLACPTWRPLGSWREQLAR 
AFVGGG PQ LLHGDT I YSGADR YRLHTAAGGTVHLE IGLLLRNFD 
RYGVEC* GTLPPTS PPSTPRTPSDGGGWHSGQEHRL 


6177 


1400 


992 


VP I ES 1» VGKVHNF PL I AF YCCE KG KRQ PHKS LHDR C FGEALD PN " 
CSHCYLDQIKRSDFLGFSGYSPHFVAISTNSEHKMQPSSMQQAL 
PSQ*PYWTDPRPALVPCCSHRPDVHRSRPGPGLPGTSGCSDRPP 
VCPI 


6178 


1027 


254 


STQRGGIKGVARAASLVGRRRAGTGMALLLCLVCLTAAIiAHGCL" 
HCHSNFSKKFSFYRHHVNFKSWWVGDIPVSGALLTDWSDDTMKE 
LHLAX PAKXTREIQjDOVATAVYnMMnOT.V'rt/ , ;j^MVT7D/^vE»n7iTr»T r» 

NIFREQVHLIQNAIIESR1DCQHRCGIFQYETISCNNCTDSHVA 
CFG YNCE SS AQWKS AVQGLLNY INNWH KQ0TSMR PRS SAFS W PG 
THRAAPAFL VLP ALRCLE P PHLANLS LEDAA* CLKQH 


6179 


806 


276 


RGETREMAGNLLSGAGRRLWDWVPLACRSFSLGVPRLlGIR"liTL 
PPPKVVDRWNEKRAMFGVYDNriGILGNFEKHPKELIRGPIWLRG 
WKGNELQRCIRKRKMVGSRMFADDLHNLNKRIRYLYKHFNRHGK 
FR * KRKLRTS EKAH LS P WRRETVLFP VRKRIjCI FS VI KWGFFG I 


6180 


156 


1833 


DHH I L KAAS TTHVCARGN I FAI PN TKCL K C * ATATP S S L>ECQN * 
SHLSLCPLPATTSGLTPNSMI PEKERQNIAERLLRVMCADLGAL 
SWSGKEFLKIjAQTLVT3SGARYGAFSVTEIIX3NFNTIiALKHIiPR 
MYNQVKVKVTCALGSNACLG I GVTCHS QS VGPDS CY I LTA YQAE 
GNH I KS YVLG VKGAD I RDSGDLVHHWVQN VLSEFVMSE IRTVYV "~ 
TDCRVSTSAFSKAGMCIjRCSACALNSVVQSVLSKRTLQARSMHE 

viellnvcedlagstglaketpgsleetspppcwwsvtdslllv 
heryeqicefysrakkmnliqslnkhllsnlaailtpvkqavie 
l s nesqp tlq lvlp t yvrlekl ftakandagtvs kl chlfl eal 
kenfkvhpahfcvami ldpqqklr pvp pyqhee i igkvceline v 
keswaebadfepaakkprsaavenpaaqeddrlgknevydylqe 
plfqatpdlfqywscvtqkhtkliaklafwliiavpavgarsgcvn 
mceqalli krrrlks pedmnklmflksnml 


6181 


169 


1032 


TRTLLSPVLLPGPRWKPWRRRPMGPIaALPAWLQPRYRKNAYLFI 
YYLIQFCGHSWIFTNMTVRFFSFGKDSMVDTFYAIGLVMRLCQS 
VSLLELLHIYVGIESNHLLPRFLQLTERIIILFWITSQEBVQE 
KYWCVLFVFWNLLDMVRYTYS MLS VIGI S YAVLTWLSQTLWMP 
IYPLCVJQAEAFAIYQSLPYFESFGTYSTKLPFDLSIYFPYVLKI 
YLMMLiF I GM YFTYS HliYS ERRDI LGI F P I KKKKM * S TAFQCDTR 
KDRLWIQCSK*NTGSILVEKFLVF 


6182 


1769 


1224 


AS* IDYQLNTIiLKEFQLTEENTKLRYLTdSLIEDMAAAYFPDCI " 
VRP FG S S VNTFGKLGCDLDM FLDLDETRNLS AHKI SGNFLME FQ 
VKNVPSERIATQKII^VXXSECLDIIFGPGCVGVQKIIiNARCPIjVR 
FSHQAS G FQCDLTTNNR I AXjTSS ELLYIYGAIiDS RVRALVFSVR 

cwarahsltssipgawitnfsltmmvifflqrrsppilptldsl 
ktladaedkcvi egnnctfvrdlsr i kpsqntetlelllkeffe 
yfgnfafdkns inirqgrsqnkpdssply iqnpfetsln i sknv 
sqsqlqkfvdltares aw i lqqedtdrps i ssnr pwglvslllps 

APNRKSFTKKKSNKFAIETVKNLLESLKGNRTENFTKTSGKRTI 
STQT 


6183 


1118 


452 


HLDR Y 1 KS PG S GS S TPAP P SHLLL YIjLHPQS TRTMGCCGCS RGC 
GSGCGGCGSSCGGCGSGCGGCGSGRGGCGSGCGGCSSSCGGCGS 
RCYVPVCCCKPVCSWVPACSCTSCGSCGGSKGGCGSCGGSKGGC 
GSCGCSQSSCCKPCCCSSGCGSSCCQSSCCKPCCCQSSCCVPVC 

CQSSCCKPCCCQSNCCVPVCCQCKI*GSGPRPSGFSCI,VKAFLM 

VP


6184 


1 


1>191 


IVTVREEDGAPAVAPPGWVSRANKRSGAGPGGSGGGGARGAEE " 
EPPPPLQAVLVADSFDRRFFPISKDQPRVLLPLANVALIDYTLE 
FLTATGVQETFVFCCWKAAQIKEHLLKSKWCRPTSLNWRIITS 
ELYRSLGDVLRDVDAKAI>VRSDFI.LVYGDVrSNrNXTRAIiEEHR 
LRRKL * KNVS VMTMI FKES S PSHPTRCHEDNVWAVDSTTNRVL 
HFQKTQGLRRFAFPLSLFQGSSDGVEVRYDLLDCHIS ICS PQVA 
QLFTDNFDYQTRDDFVRGLLVNEEI LGNOIHMHVTAKE YGARVS 
NLHMYSAVCADVI RRWVYPLTPEANFTDSTTQSCTHSRHNI YRG 
P E V S LG HGS I LEENVLLGS GTVIG SN CF ITNS VI GPGCH I E PGD 
NWLDQTYLWQG VRVAAGAQ I HQSLL CDNAEVKER VTLKPR S VL» 
TSQWVG PN I TLP EGSV I S LH PPDAE EDEDDGE FS DDSGADQE K 
DKVKMKGYNPAEVGAAGKGYLWKAAGMNMEEEEELQQNLWGLKI 
wnoiiiiQjioiiiOJawoMiJor.c.i'lJSRGGS PQMDDIKVFQNEVLGTLQR 
GKEENI S CDNL VLE INS LKYAYNI S LKEVMQVLSHWLE FPLQQ 
MDSPLDSSRYCALLIiPLLKAWS P VFRN YI KRAADHLEALAAI ED 
F FLEH E ALG I SMAKVLMAF YQLE I LAE ET I LS WFS QRDTTD KGQ 
QLRKNQQLQRFIQWIiKEAEEESSEDD 


6185 
6186


791 
5^9 


44 
238 


PCTSCVLWATLHLPASTRKAPQAECGMIS ITEWQKIGVGITCFG ' 

IFFILFGTLL.YFDSVLIAFGNLLFLTGLSLIIGLRKTFWFFFQR 

HKLKGTSFLLGGWIVLLRWPLLGMFLETYGFFSLFKGFFPVAF 

GFLGNVCNI PFLGALFRRLQGTSSMV* KTEMSSLNLDHWIxKGAK 

REEWEPPPQSPALTHSPTYPGPPQVQKERNGAEQLTSNPQVDSR 

GCQEAEMQTPRRLGWGWYHTLTLYLWEEK 

VYGIDSSNTNTHGAEERNRKLKKHWKLCmQSRLDVNGLALKMA 
KERKVKNKVKNKADTEEVFNNSPTNQEKMPTSAIBPDFSGSVIS 
NIRNQMETLHSQPHQEENLCFENSFSLINLLPINAVEPTSSQQI 
PNRETSEANKERRKMTSXSSESNI YSPLTSFITADSELHDI IKD 
LEDCLMVGLHTCGDLAPNTLR I FTSNSEI KGVCSVGCCYHLLSE 
E FENQH KE RTQE KWG F PMCH YLKEERWCCG RNARMS ACLALER V 
rtrtsjvoiur i r.£>Lir IKAVbUUJ. J.KOC1GITKCDRHVGKIYSKCSSF 
LDYVRRSLKKLGLDESKLPEKI IMNYYEKYKPRMNELEAFNMLK 
WLAPCIETLILIiDRbCYLKEQEDIAWSALVKLFDPVKSPRCYA 
VIALKKQQ* FPLKQI IRCISL *DSAGCAEEVSVGDGGPAIjRDAP 
PSGSRVGSRYD 


6187 


1701 


771 


DAWGPETRLARILNPDSFIEPRPGRLPELEATRPHMEPKASCPA 
AAPLMERKFHVLVG VTGS VAALKLPLliVS KLLDI PGlrE VAWT T 
ERAKHFYSPQDIPVTLYSDADEWEMWKSRSDPVLHIDLRRWADL 
LLVAPLDANTLGKVASG ICDNLLTCVMRAV7DRS KPLLFCPAMNT 
AMWEHP I TAQQVDQLKAFGYVE I PCVAKKLVCGDEGLGAMAEVG 
a x vwis.v^vijFQHSGFQQS*PGISVMGVPIjYSEWVQAKSVKMDV 
GK I GG YPH LLNGG PALS L PRGQACS RLNWTEG PGLS F FQPGEAA 
A 


6188 


238 


1534 


KGFV^AGPLMAELQVSPQWKAPEMSQICLSCGHPSA*GPRWASW 
NIGVFICIRCAGIHRNLGVHISRVKSVNLDQWTQEQIQCMQEMG 
NGKA3MRLYEAYLPETFRRPQIDPAVEGFIRDKYEKKKYMDRSLD 
1NAFRKEKDDKWKRGSEPVPEKKLEPWFBKVKMPQKKEDPQLP 
RKSS PKSTAP VMDLLGLDAPVACS IANSKTSNTLEKDLDLLASV 
PSPSSSGSRKWGSMPTAGSAGSVPENLNLFPEPGSKSEEIGKK 
QLSKDSILSLYGSQTPQMPTQAMFMAPAQMAYPTAYPSFPGVTP 
fJN£> j. Mi*^MMPPPV(^VAQPGASGMVAPMAMPAGY>K3GMQASMMG 
V PNGMMTTQQAG YMAGMAAMPQTVT G VQPAQQLQWNLTQMTQQM 
AGMNFYGANGMMNYGQSMSGGNEQAANQTLSPQMWK 


6189 


1297 


793 


LGEPLGDLCELIPaDVQQLQMGEVHPGTGAQGSAAQSVAGEVQL 
TQLSHARQR PS CQQS QLIALDLQHMD I SRQPR WQHVQP VARQVQ 
RAQQAQLAEGVAVHLWAGDAWAEVELLQEVGGGKVFAANACDL 
WQDH EGA^IAARQATGHALQRVI VQVRR VQPLEAL*R VP SGLPR 
RVRAFMILHNQITGIGREDFATTYFLEELNLSYNRITSPQVHRD 
AFRKLRLLRSLDLSGNRiHMLPPGLPRNVHVLKVKRNBLAALiAR 
GALAGMAQLREL YLT SNRLRS RALGPRAWVDLAHLQLLD I AGNQ 
LTEIPEGLPESLEYLYLONNKT^AVPATjaimcToisTr vr^wr aokt 

KlaAVGSVVDSAFRRLKHLQVLDIEGNLEFGDISKDRGRLGKEKE 
EEEEDEVEEEETR 


6190 


66 


1309 


I LVGNVS FLLS FAE YVCNCS WGS LNVNR rwn Trr: nr- tr r-ppz-vVS — 

GLHCETCKEGFYLNYTSGLCQPCDCSPHGALSIPCNSSGKCQCK 
VGVIGS ICDRCQDGYYGFS KNGCLPCQCNNRSAS CDALTGACLN 
CQENS KGNHCEECKEGFYQSPDATKECLRCPCS AVTSTGS CS I K 
SSELEPECDQCKDGYIGPNCNKCENGYYNFDSICRKCQCHGHVY 
P VKTP KICKPESGE CINCLHNTTG FW CENCL * G YVHDLEGNCI K 
KVILPTPEGSTILVSNASLTTSVPTPVINSTFTPTTLQTIFSVS 
TSENSTS ALADVS WTQFNI I ILTVI 1 1 VWLLMGFVGAVYM YRE 

YQNRKIjNAPFWTIELKEDNISFSSYHDSIPNADVSGLLEDDGNE 
VAPNGQLTLTTPIHNYKA 


6191 


1212 


1511


VI^CHGGLLHLSTHHLGIKPSMH*LFFLMLSFPHLTPQQPKCPS"~ 

MIDWIKKI^IYTMEYYATIKRNEIMFFAGTWMEMEAIILSKLM 
QDYMFSLISGS 


6192 


3 


950 



TRGCGNKMAGKKWLSSl^VYAEDSEPESDGEAGll^VGSAAEE- 
KGGLVS DAYGEDD FS RI/3GDEDG YEEE EDEMSRQS EDDDS ETEK 
PEADDPKDNTEAE KRDPQ ELVAS FSERVRNMS PDE I KI P PE P PG 
^CSNHLQDKIQKLYERKI KEGMDMNYI IQRKKEFRNPS I YEKL I 
JFCAIDEl^TNYPKDMFDPHGWSEDSYYEALAKAQKIEMDKLEK 
AKKERTKIEFVTQTKKGTTTNATSTTTTTASTAVADAQKRKSKW' 
DS AX P VTTIAQPTILTTTATLPAWTVTTSASGS KTT VI SAVGT 
IVKKAKQ 


6193 


3 


950 


TRGCGN KMAGKKNVLS SLAVYAED S SPESDG BAG I EAVGSAAEE 
KGGLVSDAYGEDDFSRLGGDEDGYE3BEDEMSRQSEDDDS3TEK 
P E ADDPKDNTEAEKRDPQELVAS FS ERVRNMS PDE I K I P P E P PG 

RCSNHLODKIOKLYER K T Y FfJMnMNVT TncVVCroMn e? <r via vr t 

Q FCAI DELGTUYP KDM FD PHGWSEDS Y YEAIiAKAQ K I EMD KLEK 
AKKERTKIEFVTGTKKGTTTNAT5TTTTTASTAVADAUKRKSKW 
DS A I PVTT I AQPT ILTTTATLPAVVT VTTSASGS KTTV I SAVGT 
IVKKAKQ 


6194 


3 


950 


TRGCGNKMAGKKNVLSSLAVYAEDS EPESDGEAG I EAVGSAAEE 
KGGLVSDAYGEDDFSRLGGDEDGYEEEEDENSRQSEDDDSETEK 
PEADDPKDNTEAEKRDPQELVASFS ERVRNMS PDE IKI PPE PPG 
RCSNHLQDKIQKLiYERKIKEGMDMNYIIQRKKEFRNPSIYEKIil 
QFCAIDELGTNYPKDMFDPHGWSEDSYYEALAKAQKIEMDKLEK 
AKKERTK I E FVTGTKKGTTTNATS TTTTTASTAVADAQKRXS KW 

IVKKAKQ 


6195 


736 


235 


VANGLQSNMPKFYCDYCDTYLTHDSPSVRKTHCSGRKHKEIWKD
YYQKWMEEQAQS L I DKTTAAFQQGKI P P TPFS AP P PAGAM I P P P 
PSLPGPPRPGMMPAPHMGGPPMMPMMGPPPPGMMPVGPAPGMRP 
PMGGHMPMM PG P PMMRP PAR PMMVP TRPGMTRPDR 


6196 


1512 


623 


KTGKRRSAAYVRNIIjDNAEQVISNLEARNLGPRLTPLLQEEDSH 

HIVBTNWRKKNLHSWVLHFNSRGSAAEFAVFHlMTRlLEATNSIi 
FLPLPPGFHTJuHTILGVQCLPLHNLLHCHJSGVLLLTETAVIRL 
MKDLDNTEKNEKIi KFS 1 1 VRL P PLIGQKI CRLWDH PMS SN 1 1 SR 

NHVTRLLQNYKKQPRNSMINKSSFSVEFIjpLNYFIEILTDIESS 
NQALYPFEGHDNVDAEFVEEAALKHTAMLLGL 


6197 


3


819 


ADPEGTE 3AVMS RYTRPPNTS LFIRNVADATRPEDLRREFGR YG 

P I VDVYI PLDFYTRR PRGFAYVQFEDVRJDAEDALYNLNRKWVCG 

RQIE I QFAQGDRKTPGQMKS KERHPCS PSDHRRSRS PSQRRTRS 

RSSSWGRNRRRSDSLKESRHRRFSYSQSKSRSKSLPRRSTSARQ 

SRTPRRN?GSRGRSRSKSLQKRSKSIGKSQSSSPQKQTSSGTKS 

RSHGRHSDSIARSPCKSPKGYTNFETKVQTAKHSHFRSHSRSRS 
YRHKNSW 


6198 


HI 


1912 


SEAALSPSFISPACFLIiRKLPALEDGTLPHPDTLGMWYEGARSE
RENHAADDSEGGALDMCCi?ERIiPGLPOPIVMFaT.nT?aj?r«rrirkcrk 

REMPPPPPPSPPSDPAQKPPPRGAGSHSLTVRSSLCLFAASQFL 
LACGVLWFSGYGHIWSQNATNLVSSLLTLLKQLEPTAWLDSGTW 
GVPSLLLVFLSGGLVLVTTLVWHLLRTPPEP PTPLP PEDRRQSV 
SRQPSFTYSEMMEEKIEDDFIiDLDPVPETPVFDCVMDIKPEADP 
TSIiTVKSMGLQERRGSNVSLTIiDMCTPGCNEEGFGYIiMSPRBES 
AREYTJLSASRVLQAEELHEKALDPFLLQAEFFEIPMNFVDPKEY 
D I PGLVRKNR YKT I L PNPHS RVCLTS PD PDD PLSS Y I WANYIRG 

YGGEEKVYIATQGPIVSTVADFWRMVWQEHTPIIVMITNtEEMN 
EKCTEYWPEEQVAYDGVEITVQKVIHTEDYRLRLISLKSGTEER 
GLKHYWFTSWPDQKTPDRAPPLLHLVREVBEAAQQEGPHCAPII 
VHCSAGIGRTGCF I ATS I CCQQLRQEGWD I LKTTCQLRQDRGG 
M I QHCEQ YQFVHHVMSLYEKQLSHQS PE 


6199 


144 


1211 


MAREWGESS&SWKKQAEDIKKIFEFKETI^TGAFSEVVLAEEKA 
TOKLFAVKCIPKKALKGKESSIENEIAVLRKIKHENIVALEDIY 
BSPNHLYLVMQLVSGGELFDRIVEKGFYTEKDASTLIRQVLDAV 
yYLHRMGIVHRDLKPENLLYYSQDEESKIMISDFGLSKMEGKGD 
w^STAC^TPGWAPEVLAQKPYSKAVDCWSIGVIAYILLCGYPP 
F YDENDS KLFEQ I LKAE YE FDS P YWDD I SDSAKD F I RNLM E KOP — 

NKRYTCEQAARHPWIAGDTALNKNIHESVSAQIRKNFAKSKWRQ 

AFNATAVVRHMRKI^LGSSLDSSNASVSSSLSIxASQJC^CASG'T'F 
HAL* 


6200 


702 


96 


IiPEVPHSLRPRVKPHLC!ClAOPAVT?\/MaPT.PVT.a\7l?nT nv«rr rTrnx? — 
W VDTHVD PP FHKS S DGTVRDRRGQDVRLYP E VPE VLKRLQS LG V 
PGAAASRTSEIEGANQDLELFDLFRYFVHREIYPGSKITHFERL 
QQKTG I P FS QMI FFDDE RRN I VD VS KLG VTC I H I QNGMNLQTLS 
QGLETFAKAQTGPLRSSLEBSPFEA 


6201 


2809 


2383 


GQTPRVRWKMRRSLRAGKRRQTAGRXSKSPPKVP1VIQDDSLPA 
GPPPQIRILKRPTSNGWSSPNSTSRPTLPVKSLAQREAEYAEA 
RKRILGSASPEEEQEKPILDRPTRISQPEDSRQPNNVIRQPLGP 
DGSQGFKQRR 


6202 


2 


426 


INADRAAVASSLLSRPTRKMAPQKDRKPKRSTWRFNLDLTHPVE 
DG I FDSGNFEQFLREKVKVKGKTGNLGN WHI KRFKNK IT WSE 
KQFSKRYLKYLTKKYLKKNNLRDWLRWASDKETYELRYFQISQ 
DEDESESED 


6203 
6204


419 


2550 


RC PR PPATAGAAASRP DRS P PSG I SGS E AAAGAGAAAPASQH PA
TGTGAVQTEAMKQILGVIDKKLRNLEKKKGKLDDYQERMNKGER 
LNQDQLDAVSKYQEVTIWLEFAKEMRSFMALSQDIQKTIKKTA 
RREQLMREEAEQKRLKTVLELQYVLDKIiGDDEVRTDIjKQGIjNGV 
* AxioiSixvajjoJUuLfi^r I iS.Lt V OPJSRLTOSIjRIjNBQYEHAS IHLWDUjE 
G KE KPVCGTT YKVJLrKE I VERV FQS NY FDSTHNHQNGLCE ESE AA 

SAPAVEDQVPEAEPE PAEE yteqs evesteyvnrqfmaetqfts 

G E KEQVDEWT VETVB WNS LQQQPQAAS PS VPE PHSLTP VAQ AD 

plvrrqrvqdi^maqmqgpynfiqdsmldfenqtldpaivsaqpm 
nptqnmdmpqlvcppvhsesrlaq pnqvpvqpeatqvp lvssts 
egytasqplyqpshateqrpqkbp i dqiqatislntdqttasss 
lpaasqpqvfqagtskplhssginvnaapfqsmqtvfkmnapvp 

P^EPETLKQQNQYQASYNQSPSSQPHQVEQTEhQQEQhQTVVG 

tyhgspdqshqvtgnkqqppqqntgfprsnqpyynsrgvsrggs 
rgarglmngyrgpangfrggydgyrpsfsntpnsgytqsqfsap 
rdysgyqrdgyqqnfkrgsgqsgprgaprgrggpprpnrgmpqm 

NTQQVN 




2933 


787 


CTHNL I SLLGGRALIHFNRFLNLK I QEGEAHNI FCPAYDCFQL V 
PGDI IKS WSKEI4DKRYLQFDIKAFVENNPAIKWCPrPGCDRAV 
RLTKQGSNTSGSDTLSFPLLRAPAVDCGKGHIiFCWECLGEAHEP 

CDCQTWKNWLQKITEMKPEELVGVSEAYEDAANCLWLLTNSKPC 

ANCKS P I QKNEG CNHMOCAKC KYDFCW I CLEE W lev u Q ttwxj w t?t/ t 

YRCTR YE VIQHVE EQS KEMTVS AE KKHKRFQE LDR FMHY YTRFK 

NHEHSYQLEQRLLKTAKEKMEQI5RALKETEGGCPDTTFIEDAV 

HVLLKTRR I LKCS YP YGPFLE PKSTKKE I FELMQTDLEMVTEDL 

AQKVNRP YLRTPRHKI IKAACIjVQQKRQEFLAS VARGVAPADSP 

EAPRRSFAGGTWDWEYLGFASPEEYAEFQYRRRHRQRRRGDVHS 

LLSNPPDPDEPSESTLDIPSGGSSSRRPGTSWSSASMSVLHSS 

SLRDYTPASRSENQDSIiQALSSLDEDDPNILliAIQLSLQESGLA 

LDEETRDFI^SNEASIjGAIGTSLPSRiDSVPRNTDSPRAALSSSE 

LLELGDSLMRLGAENDPFSTDTLSSKPLSEARSDFCPSSSDPDS 

AGQDPNINDNLLGNI^1AWFHDMNPQSIALIPPATTEISADSQLP 

CI KDGSEGVKDVELVLPEDSMFEDAS VS EGRGTQ1 EENPLEENI 

PGGGKQHPQAW 


6205 


1 


1200 


RAHRGKMAljEVGDMEDGQLSDSDSDKTVAPSDKPLQLPKVLGGD 
S AMRAFQNTATACAPVSHYRAVES VDS SEES FSDS DDDS CLWKR 
KRQKCFNPPPKPEPFQFGQSSQKPPVAGGKKINNIWGAVLQEQN 
QDAVATELGIU^EGTIDRSRQSETYOTLIjAKKLRKESQEHTKD 
LDKELDEYMHGGKKMGSKEEENGQGHLKRKRPVKDRLGNRPEMN

SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence



6206 



10 



6207 



Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence



1442 



2924 



6208 



2924 



6209 



1758 



6210 



3761 



1471 



1471 



829 



TOT 



Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)
Y KGR YE I TAEDSQE K VADE I S FRLQE PKKDL I AK VVR 1 1 GNKKA 
ISLIiMETAE VEQNGGLFIMNGSRRRTPGGVFLNLLKNTPS IS EE 
QIKDIFYI ENQKE YENKKAARKRRTQVLGKKMKQAI KSLNFQED 

DDTSRETFASDTNEALASLDE SQEGHAEAKLEAEEAI EVDHSHD 
I*DIF 

I I5ERRERS CLHLVC IRCS CDWEMG S VL GLCSMAS W I PCLCGS 
APCLLCRCCPSGNNSTVTRLIYALFIiLVGVCVACVMLIPGMEEO 
IiNKI PGFCENEKGWPCNI LVGYKAV YRLCFGLAMFYLLLSLLM 
I KVKS S S DPRAAVHNGFWFFKFAAAIAI I IGAFFI PEGTFTTVW 
F YVGMAGAFC FI I>I QLVLLI D FAHSKNES W VEKME EGNSRCWYA 
ALLSATALNYLLSLVAIVLFFVYYTHPASCSENKAFISVNMLLC 
VG AS VMS I L P KIQE S Q PRSGLLQS SV I TVYTM YLTW SAMTNE P E 
TNCNPSLLSIIGYNTTSTVPKEGQSVQWWHAQGIIGLILFLLCV 
FYSSIRTSNNSQVNKLTLTSDESTLIBDGGARSDGSliEDGDDVH 
RAVDNERDGVTYSYSFFHFKLFIiASLYIMMTIiTNWYRYEPSREM 
KSQWTAVWVKISSS W I GI VI, Y VWTLVA P LVLTNRDFD 
T VMAEAATPG TTATTS GAGAAAATAAAASPTPI PTVTAPSLGAG 
GGGGGS DGS GGG WTKQ VTCR Y FMHGVCKEGDNCR YSH DLSDS P Y 
SWCKYFQRGYCIYGDRCRYEHSXPLKQEEATATELTTKSSIiAA 
SSS LSS 1 VGPLVEMNTGEAES RNSNFATVGAGSEDW VNAIEFVP 
GQPYCGRTAPSCTBAPLQGSVTKE3SEKEQTAVETKKQLCPYAA 
VGECRYGENCVYIiHGDSCDMCGLQVLHPMDAAQRSQH IKSCIEA 
HEKDMELS FAVQRS KDMVCG 1 CMEWYEKANPSERRFGILSNCN 
HTY CLKCI R KWRS AKQ FES K 1 1 KS CPE CR I TSNFV1 PS E YWVBE 
KEEKQKLILKYKEAMSNKACRYFDEGRGSCPFGGNCFYKHAYPD 
GRRE EPQR QKVGTS SR YRAQRRNH FWEL IEER ENSNP FDNDEEE 
WTF5LGEMLLMLLAAGGDDELTDSEDEWPLFHDEXiEDF YDLDI> 
T VMAEAATPGTTATTSGAGAAAATAAAAS PTPIPTVTAPSLGAG 
GGGGGS DGSGGGWTKQVTCRYFMHGVCKEGDNCRY5HDLSDSPY 
S WCKYFQRGYCI YGDRCR YEHS K PLKQBEATATELTTKSSIAA 
SSShSS I VG PbVEMNTG E AES R NSNFAT VGAGS EDW VNAI EFVP 
GQPYCGRTAPSCTEAPLQGSVTKEESEKEQTAVETKKQLCPYAA 
VGECR YGENCVYLHGDS CDMCGLQVLHPMDAAQRSQHI KSCI EA 
HE KDMELS FAVQRS KDMVCG 1 CME WYEKANPSERRFG I LSNCN 
HTYCIjKCIRKWRSAKQFESKI I KSCPECR I TSNFVI PSE YWVEE 
KEEKQKLILKYKEAMSNKACRYFDEGRGSCPFGGNCFYKHAYPD 
GRREEPQRQKVGTSSRYRAQRRNH^ELIEERENSNPFDNDEEE 
WTFELGEMLLMLIAAGGDDELTDSEDEWDL FHDELEDFYDLDL 
ERLCFPCMQS KI YS YMSPNKCSGMRFP LQEBNS VTHHE VKCQGK " 
PLAG I YRKREEKRNAGNAVRSAMKSEEQKI KDARKGPLVP FPNQ 
KS BAAE PPKTP PSS CDSTNAAIAKQAIiKKP I KGKQAPRKKAOGK 
TQQNRKLTDFYPVRRSSRKSECAELQSEERICRIDELIESGKEEGM 
KIDLIDGKGRGVXATKQFSRGDFWEYHGDLIEITDAKKREAIiY 
AQDPSTGCYMYYFQYLSKTYCVDATRETNRLGRLINHSKCGNCQ 
TKLHDIDGVPHLILIASRDIAAGEELLYDYGDRSKASIEAHPML 



IFGMSKLRMVLLEDSGSADFRRHFVNLSPFTITVVLLLSACFVT 
SSLGGTDKELRLVDGENKCSGRVEVKVQEEWGTVCNNGWSMEAV 
S VICNQLGCPTAI KAPGWANS SAGSGRI WMDHVS CRGNESALWD 
CKHDGWGKHSNCTHQQDAGVTCSDGSNLEMRLTRGGNMCSGRIE 
IKFQGRWGTVCDDNFNIDHAS VICRQLE CGSAVS FSGSSNFGEG 
SGP I WFDDLjCCNGNESALWWCKHCGWGKHNCDHAEDAGVI CS KG 
ADLSLRLVDG VTECSGRLE VR FQGE WGTICDDG WDS YDAAVACK 
QLGCPTAVTAIGRVNASKGFGHIWLDSVSCQGHEPAVWQCKHHE 
WGKHYOraNEDAGVTCSDGSDLEIiRLRGGGSRCAGTVEVEIQRL 
LGKVCDRG WGLKEAD WCRQLGCGSAIiKTS YQVYS K I QATNT W L








FLSSCNGNBTSLWDCKNWQWGGLTCDHYEEAKITCSAHREPRLV" - 
GGDIPCSGRVEVKHGDTWGS1CDSDPSLEAASVLCRELQCGTW 
S I LGGAH FGEGNGQ I WAEE FQ CEGHE SH LSLCPVAPRPEGTCSH 
SRDVGWCSRYTEIRLVNGKTPCEGRVELKTLGAWGSLCNSHWD 
I EDAHVLCQQLKCGVALSTPGGAR FGKGNGQI WRHMFHCTGTEQ 
HMGDCPVTALGASLCPSEQVASVICSGNQSQTLSSCNSSSLGPT 
RPTI PBESAVACI ESGQLRLVNGGGRCAGRVEI YHEGS WGTI CD 
D S WDLS DAHWCRQIiGCGEAINATGS AHFGEGTG P I WLDEMKCN 
GKESRIWQCHSHGWGQQNCRHKEDAGVICSEPMSLRLTSEASRE 
ACAGRLEVFYNGAWGTVGKSSMSETTVGVVCRQLGCADKGKINP 
ASLDKAMSIPMWVDNVQCPKGPDTLWQCPSSPWEKRIiASPSEET 
W ITCDNKI RLQEG PTS CSGRVEI WHGGSWGTVCDDS WDLDDAQV 
VCQQ LG CG P ALKAFKEAEFGQGTG ? I WuNE VKC KGNES S LWDCP 
ARRWGHSECGHKEDAAVNCTDISVQKTPQKATTGRSSRQSSFIA 
VGIL.GWLI1AIFVALFFLTKKRRQRQRLAVSSRGENI1VHQIQYR 
EMNS CLNADDLDIiMNS S GGHSE PH 


6211 



3761 


387 


I FGMSKLRM VLLEDSGS ADFRRHFVNLS PFT1 TVVLLLSACF VT 
SSLGGTDKELRLVDGENKCSGRVEVKVQEEWGTVCNNGWSMEAV 
S VI CNQLGC PTAI KAPG WANS S AGSGRI WMDIIVSCRGNESAIiWD 
CKHDG WGKHSNCTHQQDAG VTCSDGSNLEMRIjTRGGNMCSGR I E 
I KFQGRWGTVCDDNFNIDHASVICRQLECGSAVSFSGSSNFGEG 

sgpiwfddlicngnesalwnckhqgwgkhncdhaedagvicskg 
aplslrlvdgvtecsgrlevrfqgewgticddgwdsydaavack 
olgcptavtaigrvnaskgfghiwi^dsvscqghepavwqckhhe 
wgkhycnhnedagvtcsdgsdlelrlrgggsrcagtveveiori, 

LGKVCDRGWGLKEADWCRQIX3CX3SAiKTSYQVYSKIQATNTWL 
FLSSCNGNETSLWDCKNWQWGGLTCDHYEEAKITCSAHREPRLV 
GGDI PCSGRVEVKHGDTWGS ICDSDFSIiEAASVLCRELQCGT W 
SILGGAHFGEGNGQIWAEEFQCEGHESHLSLCPVAPRPEGTCSH 
SRDVG WCSR YTE I RLVNGKTPCEGRVELKTLGAWGSIiCNSHWD 
IEDAHVLCQQLKCGVAI1STPGGARFG KGNGQI WRHMFHCTGTEQ 
HMGDCPVTAXiGAS LCPSEQVAS VI CSGNQSQTLSS CNSSSLGPT 
RPT I P EESAVACI ESGQLRLVNGGGRCAGRVB I YHEGS WGTI CD 
DS WDL SDAHW CRQLGCGEA INATGS AH FGEGTG P I WL DE MKCN 
GKES R I WQCHSHG WGQQNCRHKEDAG VI CS E FMS LRIjTS EASRE 
ACAGRLE VF YWGAWGTVGKS SMSETTVGVVCRQLG CADKGK INP 
ASLDKAMSIPMWVDNVQCPKGPDTIjWQCPSSPWEKRIASPSEET 
WITCDNKIRLQEGPTSCSGRVEIWHGG5WGTVCDDSWDLDDAQV 

vcqqlgcgpalkafkeaefgqgtgpiwlnevkckgnesslwdcp 
arrwghsecghkedaavnctdi s vqktpokattgrssrqss fl a 

VGILGWLtAI FVAIjFFLTKKRRQRQRIiAVSSRGENLVHQIQ YR 
EMNSCLNADDLDLMNSSGGHSEPH 


6212 


1 


1134 


lkwelrpggavwgtgrgagtgaprscccqtnpgppsslrrafr'r 
relpfpacheiglgaeagsgpppapaaresrsrameeeasspgl 

GCSKPHLEKLTLGITRILESSPGVTEVTIIEKPPAERHMISSWE 
QKNNCVMPEDVKNFYLMTKGFHMTWS VKLDEHI IPLGSMAINS I 
S KLTQLTQS SM YSL PNAPTLADLEDDTHEASDDQPE KPHFDSRS 
VIFELDSCNGSGKVCLVYKSGKPALAEDTEIWFLDRAIiYWHFLT 
DTFTA Y YRLLITHLGLPQWQYAFTS YG IS PQAKQRVSMYKP I TY 

NTNLLTEETDSFVNKLDPSKVFKSKNKIVIPKKKGPVQPAGGQK 
GPSGPSGPS TSSTSKSSSGSGNPTRK 


6213 


1 


1134 


IjK WEIiRPGG A VWGTGRGAGTGAP R S CCCQTNPGP PS S LRRAFRR 
RELPFPACHEIGLGAEAGSGPPPAPAARESRSRAMEEEASSPGL 
GCS KPHL E KLTLG I TR ILESS PGVTE VT 1 1 E KPPAERHM I S S WE 
QKNNCVMPEDVKNFYLMTNGFHMTWSVKLDEHIIPLGSMAINS1 
SKLTQLTQSSMYSLPNAPTLADLEDDTHEASDDQPEKPHFDSRS








VI FELDS CNGSGKVCL V Y KSGKPAXAE DTE I W FI»DRAIi YWHFLT 
D TFTAYYRX»L I THLGL PQ WQYAFTS YGISPQAKQR VSM Y KP I T Y 
NTNLLTEETDS FVNKLDPS KVFKS KNKTVI PKKKGPVQPAGGQK 
GPSGPSGPSTSSTSKSSSGSGNPTRK 


6214 


2 


460 


HEI1APSAIRRAARLGLGPARWQSRAAAFV FVRGFRTGWS FVGWV 
VLGTS AKRTRL F FFLS KMAAS SRAQVIAL YRAMLRES KR FS AYN 
YRTYAVRR I RDAFRENKNVKDPVE I QTLVNKAKRDLGVIRRQVH 
IGQbYSTDKLI I ENRDMPRT 


6215 


2 


1849 


FVAGG PRGSGSAAETMP E I RVTPLGAGQDVGRS C I LVS IAGKNV 

CGALPYFSEMVGYDGPIYMTHPTQAICPILLEDYRKIAVDKKGE 
ANFFTSQMI KD CM KKWA VH L HQ T VQ VDDELE I KAYYAGHVLGA 
AMFQIKVGSESVVYTGDYWMTDniSUT.rtlVAWTT>voDE>MT t -n»ie»orr> 

YATTIRDSKRCRERDFIiKKVHETVERGGKVLIPVFALGRAQELC 
ILLETFWERNINLKVPIYFSTGLTEKANHYYKLFIPWTNQKIRKT 
FVQRNM FEFKHI KAFDRAFADNPGPMWFATPGMLHAGQS LQIF 
RKWAGNEKNMVIMPGYCVQGTVGHKXLSGQRKLEMEGRQVLEVK 
MOVE YMSFSAHADAKGIMQ LVGQAE PES VLL VHGEAKKME FLKQ 
KIEQELRVNCYMPANGETVTLPTSPSIPVGISLGLLKRBMAQGL 
LPEAKXPRLLHGTLIMKDSNFRIiVSSEQAUKELGLAEHQLRFTC 
RVHLHDTRKEQETAIiRVYSHIiKSVLKDHCVQHLPDGSVTVESVL 

LOAAAPS ED PGTKMX ■T.V.QUTVnnPPT .CI C T?t .t e t .t . iwrir nrt •» r-.« 


6216 


11 


393 


QTTRPEPRNSAt.RQSRSKMAWGVSSVSRLIiGRSRPQLGRPMSS 
GAHGEEGSARMWKTLTFFVAIiPGVAVSMLNVYLKSHHGEHERPE 
FIAYPHliRIRTKPFPWGDf5NHTr.F , HNPWVXrt>T.D tt> v cni? 


6217 


9 


1178 


TRVGRGESGIiKWEVKPPPGRPQPDSGRRRRRRGEEGHDPKEPEQ 
t»R KI« F IGGLS FETTDDSLREHFEKWGTLTDCWMRDPQTKR s RG 
FGFVTYS CVEEVDAAMCARPHKVDGRWEPKRAVSREDSVKPGA 
HLTVKKI FVGGIKEDTEEYNLRDYFEKYGKIETIEVMEDRQSGJC 
KRG PAFVTFDDHDTVDK I WQ KYHTI NGHN"CEVKKALSKQEMQS 
AGSQRGRGGGSGNFMGRGGNFGGGGGNFGRGGNFGGRGGYGGGG 
GGSRGSYGGGDGGYNGFGGDGGNYGGGPGYSSRGGYfV3Rr5T>r , Vf-« 

NQGGGYGGGGGYDGYNEGGNFGGGNYGGGGNYNDFGNYSGQQQS 
NYGPMKGGSFGGRSSGSPYGGGYGSGGGSGGYGSRRF 


6218 


1305 


906 


S CERRGF I MADDLKR FLiYKKL PS VEGULA.I WS DRDGVPVI KVA " 
NDNAPEHALRPGFLS TFALATDQGSKliGLS KNKS 1 1 CYYNT YQV 
VQFNRLPLWS FIAS SSANTGL I VSLEKELAPLFEELRQWEVS 


6219


2 


890 


AGPGEGAGAGTRCAGAEAEMASAGGEDCESPAPEADRPHQRPFL"' 
IGVSGGTASGKSTVCEKIMELLGQNEVEQRQRKWILSQDRFYK 
VLTAEQKAKALKGQYNFDHPDAFDNDLMHRTLKNIVEGKTVEVP 
T YDFVTHS RLPETTVVYPAD VVLFEG I I*VF YS QE I RDM FHLRL»F 
VDTDS D VRI/SRR VLRD VRRGRDLEQILTQ YT rFVKPAFEEFCL P 
TKKYADVIIPRGVDNMVAINLIVQHIQDILNGDICKWHRGGSNG 
RS YKRTFS E PGDHPGMLTSGKRSHIjES S SRPH 


6220 


227


764 


EQNISLEMSCTIEKAbADAKALVERLRDHDDAAESLIEQTTALN 
KRVEAMKQ YQElE IQELNEVARHRPRSTIjVMG IQQENRQ IRELQQ 
ENKELRTS LEEHQSALELIMSKYREQMFRLLMASKKDDPGI IMK 
L^QHSKIDMVHRWKSEGFFLDASRHILEAPQHGLERRHLEANQ 


6221 


98 


916 


RWIWDLNPVSDGLELRPKYNGILHCLTTIWKLDGLRGLYQGVTP 
NIWGAGLSWGLYFVFYNAIKSYKTEGRAERLEATEYLVSAAEAG 
AMTLC I TN PLW VTKTRLMLQ YDAVVNS PHRQ YKGM FDTLA/K I Y K 
YEG VRG L YKG FVPGLFGTSHGALQ FMAYELLKLKYKQHI NPJJP E 
ft.QLS TVE YI S VAALS K I FAVAAT Y P YQ WRARLQDQHMFY SGVI 
DVITKTWRKEGVGGFYKGIAPNI.IRVTPACCITFWYENVSHFI, 
LDLREKRK

6222






2 


2116 


MARELRALI^LWGRRLiRPIiljRAPAIjAAVPGG KP I LCPRRTTAQLG 

PRRNPAWSIjQAGRJjFSTQTAEDKEEPLHSIISSTBSVQGSTSKH 

EFQAETKKIjLDIVARSLYSEKEVFIRBLISNASDALEKLRHKLV 
SDGOALPEMRTHT.OT'MttPlfr^TTTTr^n'Pr'T/tiui^rtotsT tiam* __ _ _ 
»^jjv»«i-Mj*-iii*jcixfiij^iiMi^Cift,jj r< i ± i ±UJJ 1«*(jMXQEELiVSNIjGTIA 

rsgs kafldalqnqaeasski igqfgvgfysafmvadrve vysr 
saapgslgyqwlsdgsgvfeiaeasgvrtgtkiiihlksdckef 

SSEARVRDWTKYSNFVSFPLYLNGRRMNTLQAIWMMDPKDVRE 
WQHEE F YR YVAQAHDKPR YTLH YKTDAP LNI RS I F YVPDMKPSM 
FDVSRELGSSVAIjYSRKVtilQTKATDILPKWLRFIRGVVDSEDI 
PLNLSRELLOESALIRKLRDVLQQRLIKFFIDQSKKDAEKYAKF 
FED YGLFMR3GI VTATEQE VKED I AKLLRYES S ALPSGQLTS LS 
EYAS RMRAGTRNI YYLCAPNRHLAEHS PYYEAMKKKDTEVLFCF 
EQFDELTLLHLREFDKKKLISVETD1WDHYKEEKFEDRSPAAE 
CLSEKETEELMAWMRNVI^SRVTITOCVTLRLDTHPAMVTVJJEMG 
AARHFLRMQQLAKTQEERAQLLQP TLE INPRHALIKKLNQLRAS 
E PGIiAQLLVDQI YENAMIAAGLVDDPRAMVGRIjNEI^VKALERH 


6223 
6224 


3 
1 


715 
133 


DAWARTMAGMVDFQDEEQVKSFLENMEVECNYHCYHEKDPDGCY ~ 
RLVDYLEG I RKNFDEAAKVLKFNCEENQHSDSCYKLGAYYVTGK 
GGLTQDLKAAARCFLMACEKPGKKSIAACHNVGLLAHDGQVNBD 
GQ PDLG KARD Y YTRACDGG YTS S CFNIjSAMFLQGAPG FPKDMDL 
ACKYSMKACDLGHIWACANASRMYKX.GDGVDKVEAKAEVLKNRA 
QQVHKEQQKGVQPLTFG 


6225 


3259 


938 


lrtissmawgpllltllahctgswaqsvltqppsvsgariphek 

IjLSCHRI^ICKIjPFSVESRKTVMGPQGiARRQAFIiAFGDVTVDFT 
OKEWRIiLSPAQRALYREVTLENYSHLVSLGILHSKPELIRRLEQ 
GEVPWGEERRRRPGPCAGIYAEHVLRPKNLGLAHQRQQQLQFSD 
QSFQSDTAEGQEKEKSTKPMAFSSPPLRHAVSSRRRNSWEIES 
SQGQRENPTEI DKVLKG I ENS R WGA FKCAERG QDFSRKMM VI IH 
KKAHSRQKIjFTCRECHQ/3FRDESALLIiHQNTHTGEKSYVCSVCG 
RGFS IiKANLLRHQR THSGE KP FLCKVGGRG YTS KS YLTVHERTH 
xvxcxv. ^^^WfivvjcCitriYUJ^bYNKHIjrCAHS 

NKSYFWHKRIHSGEKPYRCQECGRGFSNKSHLITHQRTHSGEK 
PFACRQCKQS FS VKGS LLRHQRTHSGE KP FVCKDCERS FSQKST 
LVYHQR THS GEKPFVCRECGQG FI QKS TL VKHQ I THSE E K?F VC 
KDCGRGFIQKSTFTLHQRTHSEBKPYGCRECGRRFRDKSSYNKH 
LRAHLGEKRFFCRDCGRG FTLKPNLT IHQRTHSGEKPFMCKQCE 
KSFSLKANLLRHQWTHSGERPFNCKDCGRGFILKSTIiLFHQKTH 
SGEKPFICSECGQGFIWKSNIjVKHQLAHSGKQPFVCKECGRGFN 
WKGNLLTHQRTHSGEKPFVCNVCGQGFS WKR S LTRHHWR I HS KE 
KPFVCQECKRGYTSKSDLTVHERIHTGERPYECQECGRKFSNKS 
YYSKHLKRHLREKRFCTGSVGEASS 


6226 
6227


29 


266 


TK VS E IjLGGS QRL FFI*PLWRRLCR(!i!6LGPRVS PMAG PR VE VDGS 
IMEGGGQSLRVSTGLSWLLSLPWRAQRIRAGRSYA 




2581 


890 


MSASSLIiEQRPKGQGNKVQNGSVHQKDGLNDDDFEPYLSPQARP - 

NNAYTAMSDSYLPSYYSPSIGFSYSLGEAAWSTGGDTAMPYIiTS 

YGQLSNGEPKFLPDAMFGQPGALGSTPFLGQHGFNFFPSGIDFS 

AWGNWSSQGQSTQSSGYSSNYAYAPSSLGGAM1DGQSAFANETL 

NKAPGMNTIDO^tytAALKLGSTEVASNVPKWGSAVGSGSlTSNI 

VASNS LP P AT I AP P KPAS WAD I AS K PAKQQPKL KTKNGIAGSSL 

PPPPIKHNMDIGTWDNKGPVAKAPSQALVQNIGQPTQGSPQPVG 
QQ ANNS P P VAQAS VGQQTQPLP PP PPQ PAQLS VQQQAAQ PTR W V 
APRNRGSGFGHNGVDGNGVGQSQAGSGSTPSEPHPVLEKLRSIN 
N YN PKD FDWNLKHGRVFI I KS YSEDD I HRS I K YN I WCSTE HGNK 

RLDAAYRSMNGKGPVYDLFSVNGSGHFCGVAEMKSAVDYNTCAG 
/WSQD K WKGRFD VRW I FVKD VPNSQLRH I RLENNENKPVTNS RD 
rQEVPLEKAKQVLKI IASYKHTTSIFDDFSHYEKRQ


6228 


47 


1978 


GRRCRRRGAVMELAQEARELGCWAVEEMGVPVAARAPESTLRRL 

CLGQGADIWAY1LQHVHSQRTVKKIRGNLLWYGHQDSPQVRRKL 

ELEAAVTRLRAEIQEL0QSLELMERDTEAQDTAMEQARQHTQDT 

QRRALLLRAQAGAMRRQQHTLRDPMQRLQNQLRRLQDMERKAKV 

DVTFGSLTSAALGLEPWLRDVRTACTLRAQPLQNLLLPQAKRG 

SLPTPHDDHFGTSYQQWLSSVETLLTNHPPGHVLAALEHLAAER 

EAEIRSLCSGDGLGDTEISRPQAPDQSDSSQTLPSMVHLIQEGW 

RTVG VT*VS QRS TLLKERQVJbTORIiQGI>VEE VERR VLGSS ERQVL 

ILGLRRCCLWTELKALHDOSQELQDAAGHRQLLLRELQAKQQRI 

LH WRQL VE E TQEQ VRLL I KGNS AS KT RLCRS PGE VLAL VQRKW 

PTFEAVAPQSRELLRCLEEEVRHLPHILLGTLLRHRPGELKPLP " 

TVLPSIHQLHPASPRGSSFIALSHIOiGLPPGKASELIiLPAAASL 

RQDLLLLQDQRSLWCWDLLHMKTSLPPGLPTQELLQ I QASQBKQ 

QKENLGQALKRLEKLLKQALERIPELQGIVGDWWEQPGQAALSE 

ELCQGIjSLPQWRLRWVQAQGALQKLCS 


6229 


1S71 


S 5 *° 


G PS LLGTRG TPNPARTLQ I FFL 1 1 G R RI/FG RMAAVDDLQ FEfi FG I 
NAATSLTANPDATTVNIEDPGETPKHQPGSPRGSGREEDDELLG 
NDDSDKTELLAGQKKSSPFWTFEYYQTFFDVDTYQVFDRIKGSL 
L P I PGKNFVRL YIRSNPDLYG P F W I CATL VFA IA I SGNLSNFL I 
HLGEKTYHYVPEFRKVSIAATI I YAYAWIiVPIiALWGFliMWRNSK 
VMNI VSYS FI*EI VCVYGYSLFI YI PTAIIiWI I PHKAVRWI LVMI 
ALGISGSLLAMTFWPAVREDNRRVAIjATIVTIVLLHMLIjSVGCL 
AYFFDAP EMDHLPTTTATPNQTVAAAKS S


6230 


1723 


600 


S KMSGRSGKKKMS KLS R S ARAG V I FPVGRLMRVLkKGTFkYRISI 

VGAPVYMAAVIEYIJU^ILELAG»[AARDNKKARIAPRHILIAVA 

NDEELNQLLKGVTIASGGVLPRIHPELLAKKRGTKGKSETILSP 

PPEKRGRKATSGKKGGKKSKAAKPRTSKKSKPKDSDKEGTSNST 

SEDGPGDGFTILSSKSLVLGQKLSLTQSDISHIGSMRVEGIVHP 

TTAEIDLKEDIGKALEKAGGKEFLETVKELRKSQGPLEVAEAAV 

SQS SGLiAAKFVIHCH I PQWGSDKCEEQLEET I KNCIiSAAEDKKL 

KSVAFPPFPSGRNCFPKQTAAQVTIiKAISAHFDDSSASSLKNVY 

FI/LFDSESIGIYVQEMAKLDAK


6231 


149 


870 


1,1^SSTMDRSIJ^LWSFGFLLI,FTAYGG1,QSLQSSLY"S^E^ 
LGVTALSTLYGGMLLSSMFLPPLLI ERLGCKGTI ILSMCGYVAF 
SVGNFFASWYTLIPTSH.LGLGAAPLWSAQCTYLTITGNTHAEK 
AGKRGKDMVNQYFGIFFLIFQSSGVWGNLIS5LVFGQTPSQETL 
PE E QLTS CGAS D CLMATT TTNS TQR PSQQLVY TLLG IYTGS G VL 
AVLMIAAFLQPIRDVQRESE


6232 
6233 


3679 
1


1476 
2654


tr'VAGTTMAGFWVGTAPLVAAGRRGRWPPQQLMLSAALRTLKHVL 
YYSRQ CLMVSRNLGS VG YD PNEECT FDK I LVANRGE I ACR V I RT C 
KKMGI KTVAIHSDVDASSVHVKMADEAVCVGPAPTSKSYLNMDA 
I ME A I KKTRAQAVHPGYGFLSENKEFARCIiAAEDWFIGPDTHA 
I QAMGDK I ES KLLAKKAE VNTI PGFDG WKDAEEAVR IARE I G Y 

PVMIKASAGGGGKGMRIAWDDEETRDGFRLSSQEAASSFGDDRL 
LrEKF I DMPRHI E IQ VLGDKHGNALWLNERECS I oppmo vwi? c 

APSIFLDAETRRAMGEQAVAIjARAVKYSSAGTVEFLVDSKKNFY 
FLEMNTRLQVEHP VTEC I TGLDLVQE M I RVAKG YPLRHKQAD I R 
I NG WAVECR VYAED P YKS FGL P S IGRLSQYQE PLHL PG VRVDS G 
I QPGSD I S I Y YDPM I SKLl T YGSDRTE ALKRMADALDNYVI RG V 
THN I ALLRB VI INS R FVKGD I STK FLS DVYPDGF KGHMLTKS E K 
NQLLAIASSLFVAFQLRAQHFQENSRMPVIKPDIANWELSVKLH 
DKVHTVVASl^GSVFSVEVDGSKLNVTSTWNLASPLLSVSVDGT 
3RWQCLSREAGGNMSIQFLGTVYKVNILTRLAAELNKFMLEKV 
IEDTS S VLR S PMPG WVAVS VKPGDAVAEGQEI C VI EAMXMQNS 
viTAGKTGTVKSVHCQAGDTVGEGDLLVELE
■IS TRE NLNAGN FNF PSEGHL VRS TG PGGS FAKHMVAQ CVS P KGP ( 






WO 01/53312 



PCT/US00/34263











LACSRTYFFGATHVP YLGGDS KLPKKTEQ I RLLSQ I YAAV I EAV 
LAGIACYAKTSSLT3CAKEVAEQTLGSGLDS FELI PFKAALRS KM 
TFHIHAVNNQGRI VPLDSBDSIiS F VKTACMAVYDI PDIiLGGNGC 
LGSWFSES FLTSQ I LVXEKDGTVTTETSS WLTAAVPRFCS WL 
VEDNE VKLS EKTHQAVRGD E S FI*GT YLTG GEGAYI/YS S NLQS W P 
EEGNVHFFSSGLLFSHCRHGS 1 1 ISKDHMNSISFYDGDSTS TVA 
ALLIDFKS SLLPHLPVHFHGSSNFLMIALFPKSKI YQAFYS BVF 
SLWKQQDNSGISLKVIQEDGLSVEQKRLHSSAQKLFSALSQPAG 
EKRSSLKLLSAKLPEIiDWFLQHFAISSISQEPVMRTHLPVLLQQ 
AE I NTTHRI ESDKVI I S I VTGL PGCHAS EliCAFLVTLHKECGRW 
MVYRQIMDSSECFHAAHFQRYLSSALEAQQNRSARQSAYIRKKT 
RLL WLQG YTDVI D WQALQTH PD SNVKAS FT I GAI TACVE PMS 
CYMEHRFLFPKCLDQCSQGLVSNWFTSHTTEQRHPLLVQLQSL 
IRAANPAAAFI LAENGI VTRNED I ELI LS ENS FSS PEMLRSRYL 
MYPGW YEGKLNAGS VYPLMVQ I CVWFGRPLEKTRFVAKCKA1 QS 
S I KPS PFSGNI YH I LGKVKFSDS ERTMEVCYNTLANSLSI MP VL 
EG PTP PPD S KS VS QDS SGQQE C YLVFIGCSLKEDS I KDWLRQS A 
KQKPQRKALKTRGMLTQQEI RS I H VKRH JjE PLPAGYFYNGTQFV 
N F FGDKTD FHPLMDQFMNDYVE B ANRE I BKYNQELEQQ E YHDIjF 
ELKP 


6234 


1731 


404 


P R VREDM DH KS PGNKGS I>V YAG I KS I VKS S LGMVE S S R HNW S GL> 
DKQSDIQNLNEERILALQLCGWIKKGTDVDVGPFLNSLVQEGEW 
E RAAAVALFNLD I RRAI Q ILNEGASS E KGDljtfLNVVAMAIjSGYT 
DEKNSLWREMCSTLRLQLNNPYLCVMFAFLTSETGSYDGVIjYEN 
KVAVRDR VAFACKFLSDTQLNR Y I EKLTNEM KEAG N L EG I L».LTG 
LTKDGVDLMES YVDRTGD VQTAS Y CMT iQG S PLDVLKDERVQY W I 
ENYRNIJjDAWRFWHKRAEFDIHRSKLDPSSKPLAQVFVSCNFCG 
KSISYSCSAVPHQGRGFSQYGVSGSPTKSKVTSCPGCRKPLPRC 
ALCLINMGTPVSSCPGGTKSDEKVDLSKDKKLAQFNNWFTWCHN 
VAtnavMiAiaMMAjci W r KUnAttCf VSACx CKCMQ LDTTGNIj VPAETV 

QP 


6235 


1 


571 


EKRDHRLPSWPRAALKVPORGGRVGTTPELAAGGIMATRNPPPQ 
DYESDDDSYEVLDIjTEYARRHQWWNRVFGHSSGPMVEKYSVATQ 

IVMRfiUTf^WfllfSTJT.WnWTmVT 7\ Ti T 1 7M Trinf dt t rr\TnrTicn\nrriT 
4. vrioov iun wu»JL>c U&vimvuAA lAVuuur liLiljyi.AoHoGYVCI 

dwkrvekdvnkakrqikkrankaapeinnlieeatefikqnivi 

SSGFVGGFLLGLAS 


6236


1 


703 


WDQNKGAAAGSGLTLPSLPSARFSAGPPTQRSRPTMSNMEKHLF 
NLKFAAKEL S RS AK KCD KE EKAEKAKI KKAI QKGNME VAR fHAE 
NAI RQ KN Q A VN FLRM S AR VDAVAARVOT A VTMGKVT KS NAG VV K 
SMDATLKTMNLEKISALMDKFEHQFETLDVQTQQMEDTMSSTTT 
LTTPQNQVDMLLQEMADEAGLDLNMELPQGQTGSVGTSVASAEQ 
DELSQRLARLRDQV 


6237 


312 


720 


PTAMAEEGIAAGGVMDVNTALQE VLKTALIHDGIiARG I REAAKA 
LDKRQAHLCVIASNCDEPMYVKLVEAIjC^HQINLIKVDDNKKIi 
GEWVGLCKIDREGKPRKVVGCSCVVVKDYGKESQAKDVIEEYFK 
CKK 


6238 


? 


4666 


EBVPTQES VKWEINVI IKNPEI VFVADMTKNDAPALVI TTQCEI 
C YKGNLENSTMTAAI KDLQVRACP FLPVKRKGKI TTVLQPCDLF 
YQTTQ KGTD PQ VI DMS VKS LTLKVS ? VI INTM I T I TS AL YTT KE 
TI PEETAS STAHLWEKKDTKTLKMWFLEESNETEKIAPTTELVP 
KGEM I KMNI DS I FI VLEAG IGHRTVPMLLAKS R FSGEGKtn^ S SL 
INLHCQLELEVHYYNEMFGVWEPLLEPLEIDQTEDFRPWNLGIK 
MKKKAKMAIVESDPEEENYKVPEYKTVISFHSKDQLNITLSKCG 
L VMIJNNL VKAFT2 AATGS S ADFVKDLAP FK I LNSLGLTI S VS PS 
DSFSVLNIPMAKSYVLKNGESLSMDYIRTKDNDHFNAMTSLSSK 
LFFILLTPVNHSTADKIPLTKVGRRIiYTVRHRESGVERSIVCQI










DTVEGSKKVTIRSPVQIRNHFSVPLSVYEGDTLLGTASPENEFN 
I PLGSYRSFI FLKPBDENYQMCEG I DFEE 1 1 KNDGALLKKKCRS 
KNPSKESFLrNIVPEKDNLTSLSVYSEDGWDLPYIMHLWPPILL 
RNLLPYKIAYYIEGIENSVFTLSEGHSAQICTAQLGKARLHLKL 
LDYI^HDWKSEYHIKPNQQDISFVSFTCVTEMEKTDLDIAVHMT 
YNTGQT WAFHS P YWMVNKTGRMLQ YKADGIHRKHP PN YKKPVL 
FS FQPNH FFKNNKVQLM VTDS BLSNQ FS I DT VGS HG AVKCKGLK 
MDYQVGVTI DLSS FNI TRIVTFTPFYM I KNKS KYH I S VAEEGND 
KWLSLDLEQCIPFWPEYASSKLLIQVERSEDPPKRIYKNKQENC 
ILLRLDNELGGIIAEVNLAEHSTVITFLDYHDGAATFLLINHTK 
NELVQYNQSSLSEIEDSLPPGKAVFYTWADPVGSRRLKMRCRKS 
HGEVTQKDDMMMPIDLGEKTIYLVSFFEGLQRIILFTEDPRVFK 
VTYES EKAELAEQEI AVALQDVG I S LVNNYTKQE VAY IGI TSSD 
VVWETKPKKKARWKPMSVKOTEKLEREFKEYTESSPSEDKVIQL 
DTNVPVRLTPTGHNMKILQPHVIALRRNYLPALKVEYNTSAHQS 
SFRIQIYRIQI QNQ I HG AVFP F VF Y P VKP P KS VTMDSAPK PFTD 
VS IVMRSAGHSQISRI KYFKVLIQEMDLRLDLGFI YALTDLMTE 

aevtentevelfhkdi eafkee yktaslvdqsqvslyey FHISP 

I KLHLS VSLSSGREEAKDS KQNGGL I PVHS LNLLLKS I GATLTD 
VQDWFKLAFFELNYQFHTTSDLQSEVIRKYSKQAIKQMYVLIL 
GLDVLGNPFGLIREFSEGVEAFFYEPYQGAIQGPEEFVEGMAIiG 
LKALVGGAVGGLACAAS KI TGAMAKGVAAMTMDEDYQQKRREAM 
NKQPAGFREG I TRGGKGLVSGFVS G ITG I VTKPI KGAQKGGAAG 
F FKGVGKGLVGAVAR PTGG 1 1 D MAS STFQG I KRATETS EVES LR 
P PR FFNEDG VI RP YRLRDG TGNQMLQKIQFYRE W I MTKS S SS DD 
DDDDDDDDES DLNH 




6239 


2108 


634 


KPG^GKGSSGRRPLLUSLLVAVATVHLVICPYTKVEESFNLQA 

TKDLL YHWQDLEQ YDHLE FPGWPRTFLGP WIAVFSS PAVYVL 

SLLEMSKFYSQLI VRGVLGLGVI FGLWTLQKEVRRHFGAMVATM 

FCWVTAMQFHI^FYCTRTLPNVIJtfJPVVLLAIA^ 

WLSAFAI IVFRV3LCLFLGLLLLLALGNKKVSWRALRHAVPAG 

ILCLGLTVAVDS YFWRQLTWPEGKVLWYNTVIiNKSSNWGTS PLL 

WYFYSAIjPRGLGCSLLFIPI^LVDRRTHAPTVIJU^GFKALYSLL 

PHKELRFI I YAFPMLNI TAARGCS YLLNNYKKSWIiYKAGSLLVI 

GHLVVNAAYSATALYVSHFNYPGGVAMQRLHQLVPPQTDVLLHI 

D VAAAQTG VS R FLQVNS AWR YDKREDVQPGTGMLAYTH I LMEAA 

PGLI.ALYRDTHRVLASVVGTTGVSIiNLTQLPPFNVHLQTKI.VLIj 

ERLPRPS 


6240 


2202 


1176 


HERGDSLKEPTS IAESSRHPSYRSEPSLEPESFRSPTFGKSFHF 
DPLSSGSRSSSLKSAQGTGF3LGQLQSIRSEGTTSTSYKSLANQ 
TRNGSLSYDSLLTPSDSPDF3SVQAGPEPDPPLGYTSPFLSARL 
AQQREAERHPRLVPTGPTHR3PSPVRYDNLSRHIVASLQEREKL 
LRQS P PLPGREEE PGLGDS G I QSTPGSGHAPRTS S S S DDS KRS P 
LGKTPLGRPAVPRFGKPDGLRGRGVGSPEPGPTAPYLGRSMSYS 
SQKAQPGVSETEEVALgPLLTPKDEVQLKTTYSKSNGQPKSLGS 
ASPGPGQPPLSSPTRGGVKKVSGVGGTTYEISV 


6241 


3 


1341 


RNAE E KKRLS LQRE KI IARVS IDNRTRALVQALRRTTDPKLCIT 
RVEELTFHLLEFPEGKGVAVKERIIPYLLRLRQIKDETLQAAVR 
EILAL IGYVDPVKGRGIRI LSIDGGGTRGWALQTLRKLVELTQ 
KPVHQLFDYICGVSTGAILAFMLGLFHMPLDECEELYRKLGSDV 
FSQNVIVGTVKMSWSHAFYDSQTWENILKDRMGSALMIETARNP 
TCPKVAAVSTIVNRGITPKAFVFRNYGHFPGINSHYLGGCQYKM 
WQAIRASSAAPGYFAEYALGjVDLHQDGGLLLNNPSAIiAMHECKC 
LWPDVPLECIVSIiGTGRYBSDVRNTVTYTSLKTKLSNVINSATD 
TEEVHIMLDGLL PPDTYFRFNPVMCENI pldesrnekldqlqle 
GLKYIERNEQKMKKVAKILSQEKTTLQKINDWIKLKTDMYEGLP
FFSKL


6242 


198 


1310 


QHFLPGAETWSPGAAVCTARRFPGRSIiAAFPRPAAPRRAVEMGE" 
SS ED I DQMFSTLLGEMDLLTQS LGVDTLPPPDPNP PRAEFNYS V 
GFIOSIiNESIiNAIjEDQDLDALMADLVADISEAEQRTIQAQKESLQ 
NQHHS AS LQAS I FSGAAS LGYGTNVAATG 3 SQ YEDDL P P P P ADP 
VLDLPLPPPPPEPLSQEEEEAQAKADKIKLALEKLKEAKVKKLV 
VKVHMNDNSTKSLMVDERQLARDVLDNLPEKTHCDCNVDWCLYE 
IYPELQIERFFE DHENWE VL S DWTRDTENKI L FLEKEE ky&up 

KNPQNFXLDNRGKKESKETNEKMNAKNKESLLEVRLILQSGRKE 
KDVCS I FKS FASENKGKI 


6243 


1509 


614


RSASRFSGCWSRDSTCCCCPSTCWSRSSASCPRARWPPSSAPAT 
TS RAS S RRLACG PQTRAGAETRS TAM I RANS AARDTRRATCRSA 
AGTPS PTTMTCLTDVPTGCAAVEPTARLPAAAWAS TITTGCCPA 
MGQAGAGPAGRKGSEAGGG PGRAHHAHPS PLPREPRVRTG P PAH 
SPTPGSIDPS PELSKGSAGVTQES PLLDPVDFIiLFRTRAVDPLR 
RVFFFFYQHLTFFS IQPQPPPCHAFHPRDPPAGTKRQL IIjVPIjK 
GPPILAPILSLTPILSRWSCYFPRSRTAOnwHT.q 


6244 


2119 


1745 


FEHAYASQFGTFLGNNESERCKLKJLQQKTMSLWSWVNQPSELSK 
FTNPLFEANNLVIWPSVAPQSLPLWEGIFLRWNRSSKYLDEAYE 
EMVNI I E YN KELQAKVN I LRRQLAELETEDGMQES P 


6245 


81 


1148


LSLRNAKYSFPQELISLFSMTDLNDNICKRYIKMITNIVILSLI 
I CI SIjAFW I ISMTASTY YGNLRP 1 3PWRWLFS WVP VL I VSNGL 
KKKSLDHSGALGGLWG Fl LTIANFS FFTSLLMFFLSS SKLTKW 
KGEVKKRLDSEYKEGGORNW\/r>vpr , Kff3avDTT?T ar r wYptrnnn 
E I P VD FS KQYSAS WMCLS LLAALACS AGDTWAS E VGP VLS KSS P , 
RL I TTWE KVP VGTNGGVT VVGLVSSLLGGTFVG I AYFLTQL I FV 
NDLDISAPQWPIIAFGGLAGDLGSIVDSYLGATMQYTGLDESTG 
MWNS PTNKARHIAGKP ILDNNAVNL FSSVL IALLL PTAAWGFW 
PRG 


6246


1177 


359 


SLWPWIIiMDDSLMQISLQLLCVYTANFPNGCSSLCWSSCGQHPV 
QATHRGAVSNS LMLCI L KLASQMPLE2TTT VQQMVFMLLS NLALS 
HDCKGVlQKSNFIiQNFLSLAIiPKGGNKHLSNLTILWLKLLLNIS 
SGEDGQQMILRLDGCLDLLTEMSKYKHKSSPLLPLL I FHNVCFS 
PANKPKI LANE KV I T VLAACLESENQNAQRIGAAALWAL I YNYQ 
KAKTALKS PS VKR R VDEA YS LAKKTFPNSEANPLNAYYLKCLEJNT 
LVQLLNSS 


6247 


3 


1678 


NSRVWGP WTE PS AGSLRPMARKQNRNS KELGLVPLTDDTSHAGP 

pgpgrallecdhlrsgvpggrrrxdwscsllvaslagafgssfL 
yg ynl s wnapt p y i kaf ynes werrhgrp i d pdtltllws vtv 
sifaigglvgtlivkmigkvlgrkhtliianngfaisaalij^cs 

LQAGAFEMLIVGRFIMGIDGGVALSVLPMYLSEISPKEIRGSLG 
QVTAI FI CI GVFTGQLLGLPELLGKES TWPYLFGVI WPAWQL 
LSLPFLPDSPRYLLLEKHNEARAVKAFQrFLGKAHVSQEVEEVL 
AESRVQRSIRLVSVLELLRAPYVRWQVVTVIVTMACYQIjCGLNA 
1 WFYTNS I FGKAG I P PAKI P YVTLSTGG IETLAAVFSGLVI EHL 
GRRPLLIGGFGLMGLFFGTLTITLTLQDHAPWVPYLSIVGILAI 
IASFCSGPGGI P FI LTGEFFQQSQRPAAFI I AGTVNWLSNFAVG 
LL FP F I OKSLDT YCFL VEAT I CI TOA I YLYF VLPETKNRT YAE I 
S QAFS KRNKAYP PE EK I DS AVTDGKINGR P 


6248 


56 


1773 


V P PPRMMAAVP POLE ?WNR VR I PKAGNRS AVT VQN PGAALDLCI — 
AAVI KBCHLVlLSLKSQTIiDAETDVLCAVLYSNHJNRMGRHKPHL 
ALKQVEQCLKRLKNMNLEGS I QDLFELFS SNENQPLTTKVCWP 
S Q P WEL VLMKVLGACKLL LR LLD CC CKTFLLTVKHIiGLQEF 1 1 
LNIiVMVGLVSRLWVLYKGVLKRLILLYEPLFGLLQEVARIQPMP 
yFIGOFTFPSDlTEFLGQPYFEAFKKKMPIAFAAKGIWKLLNKLF 
LINEQSPRASEETLLGISKKAKQMKINVQNNVDLGQPVKNKRVF 











KEES S EFD VRAFCNQLKHKATQE TS FD FKCS QS RLKTTK YS SQK 
VIGTPHAKS FVQRFREAES FTQLS EE IQMAWWCRS KKLKAQAI 
FLGNKLLKSNRLKHLEAQGTSLPKICLECIKTSICNHLLRGSGIK 
TS KHHLRQRR SQNKFLRRQRKPQRKLQSTLLRE IQQ FSQGTRKS 
ATDTSAKWRLSHCTVHRTDIjYPNSKQLLNSGVSMPVIQTKEKMI 
HENLRG IHENETDS WTVMQ INKNSTSGTI KETDDIDD I FALMGV 


6249 


56 


1773 


VPPPRMMAAVPPGLEPWNRVRIPKAGNRSAVTVQNPGAALDLCI 
AAVIKECHLVILSLKSQTLDAETDVLCAVLYSNHNRMGRHXPHL 
ALKQVEQCLKRLKNMNLEGSIQDLFELFSSNENQPLTTKVCWP 
SQPWELVLMKVLGACKLLLRLLDCCCKTFLLTVKHLGLQEFII 
LNLVMVGLVS RLWVL YKG VLKR L I LL YB PLFGL LQE VAR I QPM P 
YFKDFTFPSDITEFLGQPYFEAFKKKMPIAFAAKGINKLIjNKLF 
L INEQSPRASEETLLGI S KKAKQMKINVQNNVDLGQPVKNKRVF 
KEESSEFDVRAFCNQLKHKATQETSFDFKCSQSRLKTTKYSSQK 
VIGTPHAKS FVQR FREASS FTQLS EE IQMAWWCRS KKLKAQAI 
FLGNKLLKSNRLKHLE AQGTSLP KKLECI KTS I CNHLLRGSG I K 
TSKHHLRQRRSQNKFLRRQRKPQRKLQSTLLREIQQFSQGTRKS 
ATDTS AKWRLSHCTVHRTDLYPNS KQLLNSGVSMPVIQTKEKM I 
HENLRG IHENETDS WTVMQINKNS TS GT I KETDDIDD I FALMGV 


6250 


232 


1306 


LAALHIMALPFRKDLEKYKDLDEDELLGNLSETELKQLETVLDD 
LDPENALLPAGFRQKNQTSKSTTGPFDREHLLSYLEKEALEHKD 
RED YVPYTGEKKGKI FI PKQKPVQTFTEEKVSLDPELEEALTSA 
SDTELCDLAAILGMHNLITNTKFCNIMGSSNGVDQEHFSNWKG 
EKI LPVFDE P PNPTNVEES LKRTKENDAHLVE VNLNNI KNI PI P 
TL KD F AKALETNTHV KC FS LAATR SNDPVATAFAEMLKVNKTLK 
SLNVESNF I TG VG I LAL I DALRDNETLAE LK I DNQRQQLG TAVE 

LEMAKMLEENTNILKFGYQFTQQGPRTRAANAITKNNDLVRKRR 
VBGDHQ 


6251 


62 


972 


TPGSGPMSAWAAASLSRAAARCLLARGPGVRAAPPRDPRPSHPE " 
PRGCGAAPGRTLH FTAAVPAGHNKWS KVRH I KG PKD VE R S R I FS 
KLCLNI RLAVKEGGPNP EHNSNLANILE VCRS KHMPKSTIETAL 
KM E KS KDT YLL YEGRGPGGS SLLI EALSNS S HKCQAD I RH I LNK 
NGG VMAVGARH S FDKKG V I WEVEDREKKAVNLERALEMA I BAG 
AEDVKETEDEE ERNVFKF ICDASSLHQVRKKLDSLGLCSVSCAL 
EFIPNSKVQLAEPDLEQAAHLIQALSNHEDVIHVYDNIE 


6252 


27 


1897 


EEFCTWIAVRVGEMETAPKPGKDVPPKKDKLQTKRKKPRRYWEE " 
ETVPTTAGASPGPPRNKKNRELRPQRPKNAYILKKSRISKiCPQV 
PKKPRBWKNPESQRGLSGAQDPFPGPAPVPVEWQKFCRIDKSR 
KLPHS KAKTRSRLEVAEAEEEETS IKAARSELLLAEEPGFLEGE 
DGEDTAKICQADIVEAVDIASAAKHFDLNLRQFGPYRLNYSRTG 
RHLAFGGRRGHVAALDWVTKKLMCEINVMEAVRDIRFLHSEALL 
AVAQNRWLHIYDNQGIELHCIRRCDRVTRLEFLPFHFLLATASE 
TGFLT YLDVS VG K I VAALNARAGRLD VMSQNP YNAVIHLGHS NG 

TVSLWSPAMKEPLAKILCHRGGVRAVAVDSTGTYMATSGLDHQL 

KIFDrjRRTVOPT.CTOTT.OMi^a^UT apcnonr Tiri\pMnrMnn>TYrii. 
JV "*- *■ *^*J*w *■ * V* J->0 A iv X Lirrt\3A\3ti±u\r zs\Jit\jLiL>v ACaMGDWNI WA 

GQG KAS P PS LEQ P YLTHRLS G P VHGLQFCP FED VLG VGHTGGI T 

SMLVPGAGEPNFDGLESNPYRSRKQRQEWEVKALLEKVPABLIC 

LDPRALAEVDVISLEQGKKEQIERLGYDPQAKAPFQPKPKQKGR 

SSTASLVKRKRKVMDEEHTvDKVRQSLQQQHHKEAKAKPTGARPS 
ALDRFVR 


6253 


27 


1897 


EEFCTWIAVRVGEMETAPKPGKDVFPKKDKLQTKRKKPRRYWEE 
ETVPTTAGASPGPPRNKKNRELRPQRPKNAYILKKSRISKKPQV 
PKKPREWKNrPESQRGLSGAQDPFPGPAPVPVEWQKFCRIDKSR 
KLPHS KAKTRSRLEVAEAEEEETS IKAARSELLLAEEPGFLEGE 
DGEDTAX I CQAD I VE AVD IAS AAKHFDLNLRQFGP YRLN YS RTG 
RHIAFGGRRGHVAALDWVTKKLMCEINVMEAVRDIRFLHSEALL 
SEQ ID NO:


Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence


Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








avaqnrwlhiydnqgielhcirrcdrvtrleflpfhfliataseH 
tgfltyldvsvgkivaalnaragrldvmsqnpynavihlghsng 
t v£>lws pam ke plaki lchrggvravavdstgtymatsgldhql 

K I FDLRGTYQPLSTRTLPHGAGHLAFSQRGLLVAGMGDWNIWA 
GQGKASPPSLEQPYLTHRLSGPVHGLQFCPFEDVLGVGHTGGIT 
SMLVPGAGEPNFDGLESNPYRSRKQRQEWEVKALLEKVPAELIC 
LDPRAIAEVDVISLEQGKKEQIERU3YDPQAKAPFQPKPKQKGR 
SSTAStiVKRKRKVMDEEHRDKVRQSLQQQHHKEAKAKPTGARPS 
ALDRFVR 


6254 




1


1139


HALG RRGG S Q E LS AAACGCFALRLRAPGS GR PALA PGAAA FAGL 
GGAPRFPP RGS AAGRTMLLKE YR I CMPLTVDE YKIGQL YM I S KH 
SHEQSDRGEGVE WQNE PFEDPHHGNGQFTEKRVYLNS KLPSWA 
RAWPKIFYVTEKAWNYYPYTITEYTCSFLPKFSIHIETKYEDN 
KGSNDTIFDNEAKDVEREVCFIDIACDEIPERYYKESEDPKHFK 
SBKTGRGQLREGWRDSHQPIMCSYKLVTVKFEVWGLQTRVEQFV 
HKVVRDILLIGHRQAFAWVDEWYDMTMDDVREYEKN^EQTNIK 
VCNQHSSPVDDIESHAQTST


6255 


1 


1444


PTRPQQELLVSLATVIFVASQKALSVESKAVIKQQLESVSNGWT
VYR I ARQASRMGNHDMAKELYQSLLTQ VASKH F YFWLNS LKEFS 
HAEQCLTGLQEENYSSALSCIABSLKFYHKGIASLTAASTPLNP 
LSFQCEFVKLRIDLLQAFSQLICTCNSLKTSPPPAIATTIAMTIj 

gndlqrcgrisnqmkqsmeefrslasrygdlyqasfdadsatlr 

NVELQQQSCIiLISHAIEALILDPESASFQEYGSTGTAHADSEYE 
RRMMS V YNHVLEEVESLNGKYTPVS YMHTACLCNAI I ALLKVPL 

sfqryffqklqstsiklalspsprnpaepiavqnnqqlalkveg 
wqhgskpglfrkiqsvclnvsstlqsksgqdyki pidnmtnem 
eqrvephndyfstqfllnfailgthnitvessvkdangivwktg 
prttifvksledpysqqirlqqqqaqqplqqqqqrnaytrf 


6257


1 


1542 


crgagaepaanprsprs lvpsleststsvppapgtmatds W ALA 
vdeqeaaaeslsnlhlkeekikpdtngavvktnanaektdebek 
edraaqsllnklirsnlvdntnqvevlqrdpnsplysvksfeel 

RLKPQLI^dVYAMGFNRPSKIQENALPIiMIAEPPQNLIAQSQSG 
TGKTAAFVLAMl»SQVEPANKYPQCLCI*SPTYEIiALQTGKVI EQM 
GKFYPELKIiAYAVRGNKLERGQKISEQIVIGTPGTVIjDWCSKIjK 
FIDPKKIKVFVLDEADVMIATQGHQDQSIRIQRMLPRNCQMLLF 
SATFEDSVWKFAQKWPDPNVIKLKREEETLDTIKQYYVLCSSR 
DEKFQALCNLYGAITIAQAM I FCHTRKTAS WLAAELS KEGHQVA 
LLSGEMMVEQRAAVIERFREGKEKVLVTTWCARGIDVEQVSVV 
INFDIiPVDKDGNPDNETYLHRIGRTGRFGKRGLAVNMVDSKHSM 
NILNRIQEHFNKKIERLDTDDLDEIEKIAN




210


615 


AFIPAMAELIQKKLQGEVEKYQQLQKDLSKSMSGRQKLEAQLflH 
NNIVKEELALLDGSNWFKLLGPVLVKQELGEARATVGKRLDYI I 
TAEIKRYESQLRDLERQSEQQRETLAQLQQEFQRAQAAKAGAPG 


6258 


210 


615 


AFIPAJMAELIQKKLOGEVEKYOOLOKDLSK^MqRRnK r p&nr h>d 1 

NNIVKEEl^LIXSSNWFKLLGPVLVKQELGEARATVGKRLDYr 

TAErKRYESQLRDLERQSEQQRETLAQLQQEFQRAQAAKAGAPG 


6259 


2 


1540 



ILEKGFPSQCHPERKWKVDDVLESSQENEDDHFWELLFHNNKTV
S VENGDRGSKTFNLGTDPVSLRN YPYKI CDSCEMNhKNISGIj I X 
S KKNCSR KK P DE FNVCEKLLLD I RHE K I P IGEKS YK YDQKRNAI 
N YHQDLS Q PS FGQS FE YS KNGQGFHDEAAF FTNKRSQ I GETVCK 
5TNECGRTFI ESLKLNI SQRPHLEMEP YGCS ICGKS FCMNLRFGH 
2RALTKDNPYEYNEYGEIFCDNSAFI IHQGAYTRKILREYKVSD 
KTWEKSALLKHQIVHMGGKSYDYNENGSNFSKKSHLTQLRRAHT 
3EKTFECGECGKTFWEKSNLTQHQRTHTGEKPYECTECGKAFCQ | 








KPHLTNHQRTHTGEKPYECKQCGKTFCVKSNLTEHQRTHTGEKP
YE CNACGKS FCHRS ALTVHQRTHTGEKPPI CNECGKSFC VXSNL 
IVHQRTHTGEKPYKCNECGKTFCEKSALTKHQRTHTGEKPYECN 
ACGKTFSQRSVLTKHQRIHTRVKALSTS 


6260 


2081 


1436 


GTGPEIHACAHASARAPGSRAMALREIiKVCLLGDTGVGKSSIVW 
RFVEDSFDPNINPTIGASFMTKTVQYQNELHKFLIWDTAGQERF 
RALAPMYYRGSAAAIIVYDITKEETFSTLKNWVKELRQHGPPNI 
WAI AGNKCDLIDVRE VMERDAKD YADS IHAI FVETSAKNAINI 
NELFIEISRRIPSTDANLPSGGKGFKLRRQPSEPKRSCC 


6261 


3 


1188 


FWYR1»GPGTRS R WPRRGS WAASLVPRGPS PAALVTS P CP PDPLR 
SPACE PCRPDFAPRPALLIiRSGPRSAPAVTGKPALKGQPGPWPG 
MAEVS IDQSKLPGVKEVCRDFAVLEDHTLAHSLQEQE IEHHLAS 
NVQRNRLVQHDLQVAKQLQEEDLKAQAQLQKRYKDIiEQQDCEIA 
QE I QEKLA I EAERRRIQEXKDED I ARLLQEKELQEEKKRKKHFP 
EFPATRAYADSYYYEDGGMKPRVMKEAVSTPSRMAHRDQEWYDA 
EIARKLQEEELLATQVDMRAAQVAQDEE IARLLMAEEKKAYKKA 
KEREKSSLDKRKQDPEWKPKTAKAANSKSKESDEPHHSKNERPA 
RPPPPIMTDGEDADYTHFTNQOSSTRHFSKSESSHKGFHYKH 


6262 


2 


1759 


PECHSQGLCSVHRPGKVPQARMSGLVLGQRDEPAGHRLSQEEiJj 
GSTRIiVSQGLEALRSEHQAVLQST.SQTIECLQQGGHEEGLVHEK 
ARQLRRSMENI ELGLSEAQVMLAIASHLSTVESEKQKLRAQVRR 
LCQE NQWLRDELAGTQQRLQRS EQAVAQLEE E KKHLE FLGQL RQ 
YDEDGHTSEEKEGDATKDSLDDLFPNEEEEDPSNGLSRGQGArA 
AQQGG YE I PARLRTLHNLVI QYAAQGRYEVAVP LCKQALEDL ER 
TSGRGHPDVATMLNIIiAliVYRDQNKYKEAAHLLITDALS IRESTL 
GPDHPAVAATLNNLAVLYGKRGKYKEAEPLCQRAJjEIREKVLiGT 
NHPDVAKQtiNNIjAIiLCQNQG KYEAVER YYQRALA I YEGQLG PDN 
PNVARTKNNLASCYLKQGKYAEAETLYKEIIiTRAHVQEFGSVDD 
DHKP IWMHAEEREEMSKSRHHEGGTP YAEYGGWYKACKVSS PTV 
NTTLRNLGALYRRQGKLEAAETLEECALRSRRQGTDPISQTKVA 
ELLGES DGRRTS QEGPGDS VKFEGGEDAS VAVE WS GDGS GTLQR 
SGSLGKIRDVLRR 


6263 


1 


2408 


RELDSLADLPERIKPPYANGLSTSHIjRSSSVEDVKLiIISEGRPT 
IEVRRCSMPSVICEHTKQFQTISBESNQGSLLTVPGDTSPSPKP 
EVFSNVPERDLSNVSNIHSSFATSPTGASNSKXVSADRNLIKNr 
APVNTVMDS'PVHLEPSSQVGVIQNKSWEMPVDRLETLSTRDFIC 
PNSN I PDQESSLQS FCNSENKVLKENADFLSLRQTELPGNSCAQ 
DPASFMPPQQPCSFPSQSLSDAESISKHMSLSYVANQEPGILQQ 
KNAVQ 1 1 SSALDTDNESTKDTBNTFVLGDVQKTDAFVPVYS DST 
I QE AS PNFEKA YTIiP VL P S EKDFNGS DASTQLNTKYAFS KLT YK 
SSSGHEVENSTTDTQVISHEKENKLESLVLTHLSRCDSDLCEMN 
AGMPKGNLNEQDPKHCPESEICCLIjSIEDEESQQSILSSIiENHSQ 
QSTQ PEMHKYGQIiVKVELEEif AEDDKTENQI PQRMTRNKANTMA 

NQSKQILASCTLLSEKDSESSSPRGRIRLTEDDDPQIHHPRKRK 
VS RVP 0 P VO VS PS 1* tif) A TCP If TnnQr. a a runQT vr nvrnnvocinn 

ANPYFEYLHIRKKIEEKRKLLCSVIPQAPQYYDEYVTFNGSYLL 
DGNPLSKICIPTITPPPSLSDPLKELFRQQEWRMKLRJbQHSIE 
REKLI VSNEQEVLRVHYRAARTLANQTLP FSACTVLLDAEVYNV 
PLDSQSDDSKTSVRDRFNARQFMSWLQDVDDKFDKLKTCLLMRQ 

QHEAAALNAVQRLEWQLKLQELDPATYKSISIYEIQEFYVPLVD 
VNDDFELTPI 


6264


143 


1960 


KHRQEWNALDMAPEIWI^PMCLIENTNGELVANPEALKILSAi 
TQPWWAIVGLYRTGKSYLMNKLAGKNKGFSLGSTVKSHTKGI 
WMWCVPHPKKPEHTLVLLDTEGLGDVKKGDNQNDS WI FTLAVLL 
SSTLVYNS MGTINQQAMDQLYYVTELTHR IRSKSS PDENENEDS 
ADFVSFFPOFVWTIiRDFSLDLEADGQPLTPDEYLEYSLiCXTQGT 








SQKDKNFWLPRbCIKKFFPKKKCFVPDLPIHRRKLAQLEKLQDE' 
ELDPEFVQQVADFCS Y I PSNSKTKTLSGGI KVNGPRLESLVLTY 
I NAI S RGDL PCMENAVLALAQ I ENS AAVQKA IAHYDQQMGQKVQ 
LPAETLQELLDLHRVSEREATEVYMKNSFKDVDHLFQKKLAAQL 
DKKRDDFCKQNQEAS SDRCSALLQVIFSPLEEEVKAG I YSKPGG 
YCLFIQKLQDLEKKYYEEPRKGIQAEEILQTYLKSKESVTDAIL 
QTDQ IliTEKEKE I EVECVKAES AQASAKMVEEMQ1 K YQQMMEE K 
EJCSYQEHVKQLTEICMERERAQLLEBQEKTLTSKLQEQARVLKER 
CQGESTQLQNE IQKLQ KTLKKKTKR YMS H KLKI 


6265 


143 


1960 


KHRQENNALDMAP E I HMTG PMCLI E NTNG E L VANPEALKI LS AI 
TQPVVWAIVGLYRTGKSYLMNKIiAGKNKGFSLGSTVKSHTKGI 
WM W CVP H P KK P EHTLVLLDTEGLGD VKKGDNQNDS W I FTLAVLL 
S S TLVYNS MGTINQQAMDQLY Y VTELTH R I RS KS S P DENENEDS 
ADFVSFFPDFVWTLRDFSLDLEADGQPLTPDEYLEYSLKLTQGT 
SQKDKNFNLPRLCIRKFFPKKKCFVFDLPIHRRKIiAQLEKLQDE 
ELDPEFVQQVADFCSYI PSNSKTKTLSGGI KVNGPRLESLVLTY 
INAI S RGDL P CMENAVLALAQ I ENSAAVQKA IAHYDQQMGQKVQ 
LPAETLQELLDLHRVSEREATEVYMKNSFKDVDHLFQKKLAAQL 
DKKRDDFCKQNQEAS SDR CSALLQVIFSPLEEEVKAGI YSKPGG 
YCLFIQKLQDLEKKYYEEPRKGI QAEE I LQTYLKSKES VTDAIL 
QTDQ1LTEKEKEIEVECVKAESAQASAKMVEEMQIKYQQMMEEK 
EKSYQEHVKQLTEKMERERAQLLEEQEKTLTSKLQEQARVLKER 
CQGESTQLQNEIQKLQKTLKKKTKRYMSHKLKI 


6266 


276 


1421 


GSHQKQMLVPCFLYSLQNRKPSLYGSLTCQGIGLDGIPEVTASE 

gftvneinkks ihiscpkenasskflapyttpsrihtks itcld 
issrgglgvssstdgtmkiwqasngelrrvleghvfdvnccrff 

PSGLWLSGGMDAQLKI WSAEDASC WTFKGHKGGI LDTAI VDR 

grnwsasrdgtarlwdcgrsaclgvladcgssingvavgaadn 
sinlgspeqmpserevgteakmlllaredkklqclglqsrqlvf 

LFIGSDAFNCCTFLSGFLLIjAGTQDGNIYQLDVRSPRAPVQVIH 

rsgapvlsllsvrdgfiasqgdgscfivqqdldyvteltgadcd 
p vykvatwe kq i ytccrdglvrryqlsdl 


6267 


3 


622 


lg mmkknns akrg pqdgnqqpa p pekvgwvr kfcgkg i fre i w k 
nryvvlkgdqlyisekevkdekniqevfdlsdyekceelrksks 
rskknhskftlahskqpgntapnliflavspeekeswinalnsa 
itraknrildevtveedsylahptrdrakiqhsrrpptrghlma 
vaststsdgmltldl iqeedpspeeptslc 


6268 


160 


1368 


HRELCQNLPAGLSSAL I DNPLTLLLS IDTYVMLQBP VTFQDVAV 

dfsreewgllgptqrteyrdvmletfghlvsvgwettlenkela 
pnsdipeeepapslkvqbssrdcalsstledtlqggvqevqdtv 
lkqmesaqekdlpqkkhfdnresqansgaldtnqvslqkidnpe 
sqansgaldtnqvllhki pprkrlrkrdsqvksmkhnsrvkihq 
kscerqkakegngcrktfsrstkqitfirihkgsqvcrcsecgk 
ifrnpryfsvhkkihtgerpyvcqdcgkgfvqsssltqhqrvhs 
gerpfecqecgrtfndrsaisqhlrthtgakpykcqdcgkafrq 
sshlirhqrthtgerpyacnkcgkaftqsshlighqrthnrtkr 
kkkqpts 


6269 


2886 


1449 


hasaptrrnmaaasplrdchawkdarlplsttsneacklfdatl 
tq yvicwtndks lgg I egclsklkaadptf vmghamatglvli gt 
gssvkldkeldlavktmveisrtqpltrreqlhvsavetfangn 
fpkacelweqilqdhptdmlalkfshdayfylgyqeqmrdsvar 
iypfwtpdiplssyvkgiysfglmetnpydqaeklakealsinp 
tdaws vhtvah ihemkae I kdglefmqhsetlwkdsdmlachny 
whwalyliekgeyeaaltiydthilpsijc^ndamldvvdscsml 
yrlqmegvsvgqrwqdvlpvarkhsrdhillfndahflmaslga 
hdpqttqellttlrdases pgencqhllardvglplcqalvbae 








DGNPDRVliELLLPIRYRIVQI^SNAQRDVFNQLLIHAALNCO^- 
SVHKNVARSLLMERDALKPNSPLTERLIRKAATVHLMQ


6270 


23 


2086 


SVTVTLGSEGDGRPPTYHLEEMEQEPQNGEPAEIKIIREAYKKA 
FL F VNKG LNTDELGQ KEEAKN YYKQG I GHLLRG I S I SS KE S EHT 
GPG WESARQMQQKMKETLQNVRTRLE I LE KGLATS LQNTDLQE VP 
KLYPEFPPKDMCEKIjPEPQSFSSAPQHAEVNGNTSTPSAGAVAA 
PASLSLPSQSCPAEAPPAYTPQAAEGHYTVSYGTDSGEFSSVGE 
EFYRNHSQPPPLETLGLDADELIIjIPNGVQIFFVNPAGEVSAPS 

ypgylr i vrfldnsldtvlnr ppgflq vcd wlyplvpdrspvl k 
ctagaymfpdtmlqaagcfvgwlsselpeddrelfedllrqms 
dlrlqanwnraeeenefqipgrtrpssdqlkeasgtdvkqldqg 
nkdvrhkgkrgkrakdtsseevnlshivpcepvpeekpkelpew 
sekvahnilsgaswvswglvkgaeitgkaiqkgasklreriqpe 
ekpvevspavtkglyiakqatggaakvsqflvdgvctvancvgk 
elaphvkkhgsklvpeslkkdkdgkspldgamwaassvqgfst 
vwqglecaakcivnnvsaetvqtvrykygynageathhavdsav 
nvgvtaywinnigikamvkktatqtghtlledyqivdnsqrenq 
egaanvnvrgekdeqtkevkeakkxdk 


6271 
6272


32 


ld*8 


GCGVKTAGMVGREKELSIHFVPGSCRLVEEEVNIPNRRVLVTGA
TGLLGRAVHKEFQQNNWHAVGCX3FRRARPKFEQVNLLDSNAVHH 

I ihdfqphvi vhcaaerrpdwenqpdaasqlnvdasgnlakea 
aavgafliyissdyvfdgtnppyreedipaplnlygktkldgek 

AVLENNLGAAVLRIPILYGEVEKIiEESAVTVMFDKVQFSNKSAN 
MDHWQQRFPTHVKDVATVC^QLAEKRMLDPSIKGTFHWSGNEQM 
TKYEMACAIADAFNLPSSHLRPITDSPVLGAQRPRNAQLDCSKL 
ETLG IGQRTPFR IGI KESLWPFLIDKRWRQTVFH 




1136 


528 


G AVME DAAAP GR TEG VL ERQGAP P AAGQGGALVELTPT PGGIiAI* 
VSPYHTHRAGDPLDLVAIAEQVQKADEFIRANATNKLTVIAEQI 
QHLQEQARKVLEDAHRDANLHHVACNIVKKPGNIYYLYKRESGQ 
QYFS I ISPKE WGTSCPHDFLGAYKLQHDLSWTP YEDIEKQDAKI 
SMMDTLLSQSVALPPCTEPNFQGLTH 


6273 
6274 


256 


843 


b CPR VS PE GRS LGCQ VMFS LPIjNCS PDH1 RRGS CWGRPQDLXI A 
SAA WNS KCHPGAGAAMARQHARTLW YDR PRYVFME FCVEDS TDV 
HVLI EDHRI VFS CKNADG VELYNE IE FYAKVNS KDSQDKRS SRS 

ITCFVRKWKEKVAWPRLTKEDIKPVW1>SVDFDNWRDWEGDEEME 
IoAHVEHYAEVRDNTYCVIiPT 


6275


56 

20 1 


1142 


AARAMAAAAGGGAGAARSLSRFRGCI^AIiIXSDCTGSFYEAHDT ~ 
VDLTS VLRHVQ S L EPDPGTPGS ERTEALYYTDDTAMARALVQS L 
LAKEAFDE VDMAH RFAQE YKKD PDRG YGAG WTVFKKLLNP KCR 
DVFEPARAQFNGKGS YGNGGAMRVAG I SLAYS SVQDVQKFARLS 
AQLTHA5SIA3YNGAILQAIAWLALQGESSSKHFLKQDIIGHMED 
LEGDAQSVLDAREIiGMEERPYSSRLKKIGELLDQASVTREEVVS 
ELGNGIAAFESVPTAI YCFLRCMEPDPEI PS AFNS LQRTI>I YS I 
SLGGDTDT I ATMAGAIAGAYYGMDQVPES WQQS CEG YEETD I LA 
QSLHRVFQKS 






565 


SRRGRARCLARGSRRPUPRPAKTMAFMVKTMVGGQLKWLTGSLG 
GGEDKGDGDKSAAEAQGMSREEYEEYQKQLVEEKMERDAQFTQR 
KAERATLRSHFRDKYRLPKNETDESQIQMAGGDVEIiPRELAKMI 
EEOTE EE EEKAS VLGQLAS L PGLNLGS IiKDKAQATLGDLKQS AE 
KCHVM 


6276 


137 


FT 



TLLPLPPLPCrrEGMILLNTGLEGTVAENPVPIVHTPSGNILTLE^ 
S CLQQLATHPGHWG I HLQIAE PAALR P S LALLARLS S LGLLHWP 

WVGAKrSHGSFSVPGHVAGRELLTAVAEVFPHVTVAPGWPEEV 
tiGSGYREQLLTDMLELCQGLWQ PVS FQMQAMtiLGHS TAGAIGRI* 
LASS PRATVTVEHN PAGGDYAS VRTALLAARAVDRTRVY YRL PO 
3 YHKDL ZiAHVGRN 


6277 


4600 


2744 


MAFRTEMGLY YS YFKT I VE APS FLNG V WM I MNDKLTE YP L VI NT 

IiKRFNLYPEVIIASWYRIYTKIMDLIGIQTKICWTVTIGEGLSP 
TESCEGLGDPACFYVAVTPYT TaTCT.MMZVT wct vr"rvr cnonT /-v-»t 

VTVLCFFFNHGECTRVMWTPPLRESFSYPFLVLQMLLVTHILRA 
TKLYRGSLIALCISNVFFMLPWQFAQFVLLTQIASLFAVYVVGY 
IDICKLRKI I YIHMISLALCFVLMFGNSMLIiTSYYASSLVI IWG 
ILAMKPHFLKINVSELSLWVIQGCFWLFGTVILKYIiTSKIFGIA 
NDAH I GNLLTS KFFS YKDFDTLL YT CAAEFDFME K3TF LR YTKT ■ 
LLLP WLVGFVAI VRKI I S DMWG VIAKQQTHVRKHQFDHGE I» VY 1 

iifiuyijjjrtl *«XAJXXJAf1KijJ\Jj * ij 1 ir WM\- VMA&Ij I CS RQjjFGWLFC 1 

KVHPGAI VFAIIjAAMS IQGSANLQTQWWI VGEFSNLPQEEL IEW 

I KYSTKPDAVFAGAMPTMAS VKLSALRP I VNHPHYEDAGLRART 

KIVYSMYSRKAAEEVKRELIKLKVNYYILEESWCVRRSKPGCSM 

PEIWDVEDPANAGKTPLCNIjLVKDSKPHFTTVFQNSVYKVLEW 
KE 


6278 


3 


823 


IDFRuVLLSLVYLLNSVATEERKPABVLIVEGQQYAWGTVIiLL 
I RI I LE YCQG VDNr PSVTTDMLTRLSDLIiKYFNSRS CQLVLGAG 
ALQWGLKTITTKNIiALSSRCLQLI VH YI PVIRAHFEARLPPKQ 
YSMLRHFDHITKDYHDHI AE I SAKLVAIMDSLFDKLLSKYEVKA 
PVPSACFRNICKQMT KMHBAI FDLLPEEQTQMLFLRINAS YKLH 
LKKQLSHLNVJ NDGG PQNGLVTADVAFYTGNLQALKGIiKDLDIiN 
MAEIWEQKR 


6279 


127 


1687 


GGAMASDGARKQFWKRSNSKLPGSIQHVYGAQHPPFDPLLHGTIi 
LRSTAKMPTTPVKAKRVSTFQEFESNTSDAWDAGBDDDELLAMA 
AESl^NSEVVMETANRVLRNHSQRQGRPTLQEGPGLQQKPRPEAE 
PPSPPSGDLRLVKSVSESHTSCPAESASDAAPLQRSQSLPHSAT 
VTLGGTSDPSTLSSSALSEREASRLDKFKQLLAGPNTDLEELRR 
LSVtSG I PKPVRPMTWKLLSGYLPANVDRRPATLQRKQK3 YFAFI 
EHYYDSRNDEVHQDTYRQIHIDIPRMSPEALIEiQPKVTEIFERr 
LFIWAIRHPASGYVQGINDLVTPFFWFICEYIEAEEVDTVDVS 
GVPAE VLCNI EADT YWCMSKLLDG IQDN YTFAQPGI QMKVKMLE 
ELVSR IDEQVHRHLDQHE VR YLQFAFRWMNWLLMREVPLRCTI R 
L WDTYQSE PDGFS HFH L YVCAAFL VRWRKE I LEEKD FQE LLLFL 
QWLPTAHWDDEDISLIiLAEAYRLKFAFADAPNHYKK 


6280 


857 


2515 


ECCDQKMGSRNSSSAGSGSGDP<2PfiT.DOT3f3Ar3T , T?OGi?r»E , ccci-i-ct — 
DVDLAQVIiAYLLRRGQVRLVQGGGAANLQFI QALUDSEEENDRA 
WDGRLGDRYNPPVDATPDTRELEFNEIKTQVELATGQIiGLRRAA 
QKHSFPRMLHQRERGLCHRGS FSLGEQSRVI SHFLPNDLGFTDS 
YSQKAFCG I YSKDGQ I FMSACQDQTI RLYDCRYGRFRKFKS I KA 
RDVGWS VLDVAFTPDGNHFL YSSWSDYIHI CN I YGF.GDTHTALD 
LRPDERRFAVFSIAVSSDGRF^I^SGANDGCLYVFDREQNRRTLQ 
lESHEDDVNAVAFADlSSQILFSGGDDAICKVtTORRTMREDDPK 
PVGAIiAGHQDGITFI DS KGDARYL I SNSKDQTIKLWDIRRFSSR 
BGMEASRQAATQQNWDYRWQQVPKKAWRKLKLPGDSSLMTYRGH 
GVLHTLIRCRFSPIHSTGQQFIYSGCSTGKVVVYDLLSGHIVKK 
LTMHKACVRDVS WHP FE E KI VS S S WDGNLRLWQ YRQAE YFQDDM 
PBSEECASAPAPVPQSSTPFSSPQ 


6281 


857 


2515 


ECCDQKMGSKWSSSAGSGSGDPSEGLPRRGAGLRRSEEBEEEDE 
DVDLAQVLAYLLRRGQVRLVQGGGAANLQF IQALLD SEE ENDRA 
WDGRLGDRYNPPVDATPDTRELEFNEIKTQVBLATGQLGliRRAA 
QKHS F PRMLHQRE RGLCHRGS FS LGEQ S RVISH FLPNDLG FTDS 
YSQKAFCGI YSKDGQIFMSACQDQTIRLY0CRYGRFRKFKS I KA 
RDVGWSVLDVAFTPDGNHFLYSSWSDYIHICNIYGEGDTHTALD 
LRPDERRFAVFSIAVSSDGREVLGGANDGCLYVFDREQNRRTLQ 
IESHEDDVNAVAFADISSQILFSGGDDAICKVWDRRTMREDDPK 
PVGALAGHQDGITFIDSKGDARYLISNSKDQTIKLWDIRRFSSR 






WO 01/53312 



PCT/US00/34263 











EGMEASRQAATQQNWDYRWQQVPKKAWRKLKLPGDSSLMTYRGH 
G VLHTL I RCRFSP I HS TGQQFI YSG CS TG KVW YD LL SGH I VKK 
LTNHKACVRDVSWHPFEEKIVSSSWDGNIiRLWQYRQAEYFQDDM 
PES EECAS APAPVPQS STPFSS PQ 


6282 


12S 


906 


RMAACRALKAVLVDLSGTLHIEDAAVPGAQEALKRLRGASVIIR
FVTNTTKESKQDLLERLRKLEFDISEDEIFTSLTAARSLLERKQ
VRPMLLVDDRALPDFKGIQTSDPNAWMGLAPEHFHYQILNQAF
RLLLDGAPLIAIHKARYYKRKDGLALGPGPFVTALEYATDTKAT
VVGKPEKTFFLEALRGTGCEPEEAVMIGDDCRDDVGGAQDVGML
GILVKTGKYRASDEEKINPPPYLTCESFPHAVDHILQHLL


6283 


140 


1043 


LSLFGIHVMNPFWSMSTSSVRKRSEGEEKTLTGDVKTSPPRTAP
KKQLPSIPKNALPITKPTSPAPAAQSTNGTHASYGPFYLEYSLL
AE FTLWKQKL PG VYVQP5 YRSALMWFGVI FIRHGLYQDGVFKF 
TVYIPDNYPDGDCPRLVFDIPVFHPLVDPTSGELDVKRAFAKWR
RNHNHIWQVLMYARRVFYKIDTASPLNPEAAVLYEKDIQLFKSK
WDSVKVCTARLFDQPKIEDPYAISFSPWNPSVHDEAREKMLTQ
KKKPESQHNKSVHVAGLSWVKPGSVQPFSKEEKTVAT


6284 


1 


2879 


RS VI PGST1 S S RWPGLSRPRFMAAHE WDW FQRE EL I GQ I S D X RV I 
QNLQ VE RENVQKRTFTR W I NLHLEKCNP PLE VKDLF VD I ODG K I 
LMALLEVIiSGRNIiLHEYKSSSHRIFRIaNNIAKALKFLEDSNVKL 
VSIDAAEIADGNPSLVLGLIWNIILFFQIKELTGNLSRNSPSSS 
LiAPGSGGTDSDSSFPPTPTAERSVAI S VKDQRKAIKALIiAWVQR 
KTRK YG VAVQDFAGS W R SG LAFLAV I KAI DPSLVDM KQALENS T 
RENLE KAFS I AQDALH I PRLL E P ED I MVDTPDEQS I MT YVAQ FL 
ERFPELEAEDIFDSDKEVPIESTFVRIKETPSEQESKVFVLTEN 
GERTYTVNHETSHP PPS KVFVCDKPES MKEFRLDGVSSHALSDS 
STEFMHQIIDQVLQGGPGKTSDISEPSPESSILSSRKENGRSNS 
LP I KKT VHFEADTYKDPFCS KNLS LCFEGS PRVAKES LRQDGHV 
LAVEVAEEKEQXQESSKI PESSSDKVAGDI FLVEGTNNNSQSSS 
CNGALE STARHDE ES HS LS P PGENTVMADS FQI KVNIiMTVE ALE 
EGD YFEAI PLKASKFNSDLIDFASTSQAFNKVPS PHETKPDEDA 
EAFENHAEKLGKRS IKSAHKKKDSPEPQVKMDKHEPHQDSGEEA 
EGCPSAPEETPVDKKPEVHEKAKRKSTRPHYEEEGEDDDLQGVG 
EELSSSPPSSCVSLETLGSHSEEGLDFKPSPPLSKVSVIPHDLF 
YFPHYEVPLAAVLEAYVEDPEDLKNEEMDLEBPEGYMPDLD5RE 
EEADGSQSSSSSSVPGESLPSASDQVLYLSRGGVGTTPASEPAP 
LAPHBDHQQRE TKEND PMDSHQSQES PNLENI ANPLEENVTKES 
ISSKKKEKRKHVDHVESSLFVAPGS VQS SDDLEEDSSD YS I PSR 
TSHSDSSIYLRRHTHRSSESDHFSLCSVEERSRSG


6285 


2157 


1331 


SCKTENLLEMWWFQQGLS FLPSALVI WTS AAFI FS Y I TAVTLHuH 
I DPALP Y I S DTGTVAPE KCLFGAMLNI AAVLC IAT I YVRYKQ VH 
ALSPEENVIIKLNKAGLVLGILSCLGLS I VANFQKTTLFAAHVS 
GAVLT FGMGSLYMF VQT I LS YQMQ P K I HG KQVFW I RLLLVI WCG 
VS ALSMLTCS S VLHSGNFGTDLEQKLHWNP ED KG YVLHM I TTAA 

EWSMSFSFFGFFLTYIRDFQKJSLRVEANLHGLTLYDTAPCPIN 1 
NERTRLLSRDI 


6286


1619 


27£ 


KAGASCCGSANPWSVGKSCVLIAMAQLQTRFYTDNKKYAVDDvH 
PFSIPAASEIADLSNI INKLLKDKNE FHKHVE FDFL I KGQFLRM 

PLDKHMEMENISSEEWEIEYVEKYTAPQPEQCMFHDDWISSIK 
GAEE W I LTGS YDKTSR I WSLEGKS IMT I VGHTD WKDVAWVKKD 
SLSCLLLSASMDQTILLWEWNVBRNKVKALHCCRGHAGSVDS IA 
VDGSGTKFCSGSWDKMLKIWSTVPTDEBDEMEESTNRPRKKQKT 
EQLGLTRTPI VTLSGHMEAVSS VLWS DAE E I CSAS WDHT IRVWD 
VESGSLKS TLTGNKVFNC I S YS PLCKRLASG S TDRHIRLWDPRT 
KDGSLVSLSLTSHTGWVTS VKWSPTHEQQLI SGSLDNI VKLWDT 
RS CKAPLYDLAAHED KVLS VDWTDTG LLLSGGADNKL YS YR YS P 

TTSHVGA 


6287 
6288 


278 



1482 


mqfffnfqiglrstsgkekysgdag^Lgdalqlflqclaldedp 

APAKLQVQKILCDLLLPENLKEGLKESSWSSLPCTKNRPFDFHS 
VMEESQSLNEPSPKQSEEIPEVTSEPVKGSLNRAQSAQSINSTE 
MPAREDCLKRVSSEPVLSVQEKGVLLKRKLSLLEQDVIVNEDGR 
NKLKKQGETPNEVCMFSLAYGDIPEELIDVSDFECSLCMRLFFE 
P VTTPCGHS FCKNCLERCLDHAP YCPLCKES LKE YLADRR YC VT 
QLIjEELIVKYLPDELSERKKIYDEETAELSHLTKNVPIFVCTMA 
YPTVPCPLHVFEPRYRLMIRRSIQTGTKQFGMCVSDTQNSFADY 
GCMLQ I RNVH FLPD GRS WDT VGGKRFR VLKRGMKDG YCTAD I E 
YLEDV 


6289 


1


743 


VTLYPCRGLVGNLLLGASGMASGCKIGPSILNSDLANLGAECLR
MLDSGADYLHLDVMDGHFVPNITFGHPWESLRKQLGQDPFFDM
HMMVSKPEQWVKPMAVAGANQYTFHLEATENPGALIKDIREKGM
KVGLAIKPGTSVEYLAPWANQIDMALVMTVEPGFGGQKFMEDMM
PKVHWLRTQFPSLDIEVDGGVGPDTVHKCAEAGANMIVSGSAIM
RSEDPRSVINLLRNVCSEAAQKRSLDR 


6290 


1


743 


VTLYPCRGLVGNLLLGASGMASGCXIGPSILNSDLANLGAECLR 
MLDSGADYiHbDVMDGHFVPNITFGHPWESLRKOLGOOPFFDM 
HMMVS KP EQWVKPMAVAGANQ YT FHLEATENPGALI KD I RF.NGM 
KVGIAIKPGTSVEYIAPWANQIDMALVMTVKPGFGGQXFMEDMM 
PKVHWLRTQFP SLDI EVDGGVG PDTVHKCAEAGANMI VSGSAIM 
RSEDPRSVINLLRNVCSEAAQKRSLDR 


6291 


3 


1856 


XLGRWLLGVYETVAPTLACLPRPRLRRRRRRRRRRMISRYTRKA 
VPQSLELKGITKHALNHHPPPEKLEEISPTSDSHEKDTSSQSKS 
DITRESSFTSADTGNSLSAFPSYTGAGISTEGSSDFSWGYGELD 
QNATEKVQTMFTAIDELLYEQKLSVHTKSLQEECQQWTASFPHL 
R1LGRQIITPSEGYRLYPRSPSAVSASYETTLSQERDSTIFGIR 
GKKLHFSSS YAHKASS I AXS S S FCSME RDEEDS 1 1 VSEG I IEE Y 

LAFDHIDIEEGFHGKKSEAATEKQKLGYPPIAPFYCMKEDVLAY 
VFDSVWCKWSCMEQLTRSHWEGFASDDESNVAVTRPDSESSCV 
LSELHPLVLPRVPQSKVLYITSNPMSLCQASRHQPNVNDLLVHG 
MPLQPRNLSLMDKLLDLDDKLLMRPGSSTILSTRNWPNRAVEFS 
TSSLSYTVQSTRRRNPPPRTLHPISTSHSCAETPRSVEEILRGA 
R VP VAP DS LS S PS PTPLS RNNLL P p I GTAE VEHVS TVG PQRQM K 
yu6 a «*Ay AWDE PN YQQPQERLLLPDF FPR PNTTQ S FLLDT 
QYRRS CAVE YPHQARPGRG SAG PQLHGS TKS QS GGRP VSRTRQG 


6292 


1732 


602 


i,VAKMASSASARTPAGKRVINQEELRRLMKEKQRLSTSRKRIES 
P FAK YNRLGQLS CAL CNTP VKS ELLWQTHVLGKQHRE KVAELKG 

AKEASQGSSASSAPQSVKRKAPDADDQDVKRAKATLVPQVQPST 
«*w x unrujMw&cr 1KATPSKPSGLSLLPDYEDEEEEEEEEEGD 
GERKRGDASKPLSDAQGKEHSVSSSREVTSSVLPNDFFSTNPPK 
APIIPHSGSIEKAEIHEKWERRENTAEALPEGFFDDPBVDARV 
RKVDAPKDQMDKEWDEFQKAMRQVNTI SEAI VAEEDEEGRLDRQ 
IGEIDEQIECYRRVEKLRNRQDEIKNKLKEILTIKELQKKEEEN 
ADSDDEGELQDLLSQDWRVKGALL 


6293 


1835 


1142 


rcPGAMKMVAPWTRFYSNSCCLCCHVRTGTILLGVWYHlNAW 
CiL I LL S ALADPD Q YNF S S S ELGGDFEFMDDANMC I AI A I SLLM I 
Ei ICAMATYGAYKQRAAW 1 1 PFFC YQI FDFALNMLVAI TVL I YPN 
SIQEYIRQLPPNFPYRDDVMSVNPTCLVLIILLFISIILTFKGY 
^CV^CYRYINGRNSSDVLVYVTSNDTTVLLPPYDDATVNGA 




2382 


1035


' we TLGT VD VH P IG W CA I NS K I L VP PRT IHAKFTBW KG YLMKRL 
fGS RTLPVD FHIKMVESMKYPFRQGMRLE VVDKSQVS RTRMAW 
>TVIGGRLRLLYEDGDSDDDFWCHMWSPLIHPVGWSRRVGHGIK 


6294






MSERRSDMAHHPTFRKIYCDAVPYLFKKVRAVYTEGGWFEEGMK
LEAIDPLNLGNICVAT VCKVLLDGYLM I CVDGGPSTDGLDWFC Y 
HASSHAI FPAT FCQKND I ELT P P KG YE AQTFNWENYLEKTKS KA 
APSRLFNMDCPNHGFKVGMKLEAVDLMEPRLICVATVKRWHRL 
LSIHFDGWDSEYDQWVDCESPDIYPVGWCELTGYQLQPPVAAEP 
ATPLKAKEATKKKKKQFGKKRKRIPPTKTRPLRQGSKXPLLEDD 
PQGARKISS E PVPGE I IAVRVKEEHLDYASPDKASS PELPVSVE 
NIKQETDD




354 


1814 


AQLTTRGRTVAGGVRWIPSPFPDLELYSCCLGTDRGFPELSHHC 
KNV I ATAS DYDMAE I TNI R PS FDVS P WAGLIGAS VL WCVS VT 
VFVWSCCHQQAEKKHKNPPYKFIHMLKGISIYPETLSNKKKIIK 
VRRD KDG PGREGGRRNLLVDAAEAGLLSR D KD PRGPS SGS C I DQ 
LPIKMDYGEELRSPITSIjTPGESKTTSPSSPEEDVMLGSLTFSV 
DYNFPKKALWTIQEAHGLPVMDDQTQGSDPYIKMTILPDKRHR 
VKTRVLRKTLDPVFDETFTFYGIPYSQLQDLVLHFLVLSFDRFS 
RDDVIGEVMVPLAGVDPSTGKVQLTRDI I KRNIQKCISRGELQV 
SLS YQPVAQRMTVWLKARHLQKMD IAGLSGNP YVKVNVYYGRK 
R I AKKKTHVKKCTLNP I FNESFIYDI PTDLLPDISIEFLVIDFD 
RTTKNFAA/GRLIIX3MSVTASGAEHWREVCESPRKPVAK;7HSLS 


6295 


2795 


617 


VSSALLTGATSGSDAAKSEGASASPLSCTNAVAMDRPDEGPPAK
TRRLSSSESPQRDPPPPPppppLLRIiPLPPPQQRPRLQEETEAA 
QVLADMRGVGLGPALPP P PPYVI LEEGGIRAYFTLGAECPGWDS 
T I E SG YGEAP P PTES LE ALPTPEASGGS LE IDFQ WQ SS S FGGE 
GALETCS AVGWAPQRLVDP KS KEEAI 1 1 VEDEDEDERES MRS S R 
RRRRRRRRKQR KVKR E S RERNAERMES ILQALEDI QLDLEAVNI 
KAGKAFLRLKRKFIQMRRPFIiERRDLiI IQHI PGFWVKAFLNHPR 
I S I L INRRD ED I FRYT/TNLQ VQDLRH I SMGYKMKL YFQTN P YFT 
NMVIVKEFQRNRSGRLVSHSTPIRWHRGQEPQARRHGNQDASHS 
FFSWFSNHS LPEADR IAE 1 1 KNDL WVNPLR Y YLR ERGSR I KRKK 
QEMKKRKTRGRCE WI MEDAPDY YAVEDI FS E I S D I DET IHD I K 
I SDFMETTDYFETTDNEI TD I ME N I CDS ENPDHNEVPNNETTDN 
NE SADDHETTDNNESADDNNENPEDNNKNTDDNEENPNNNE1TTY 
GNNFFKGG FWGSHGWNQDS S DS DNEADE ASDDEDNDGNEGDNEG 
SDDDGNEGDNEGSDDDDRDIE YYEKVI EDFDKDQADYEDVI E 1 1 

SDESVEEEGIEEGIQQDEDIYEEGNYEEEGSEDVWEEGEDSDDS 
DLEDVLQVPNGWANPGKRGKTG




727 


1199 


RHCGCDAQGACDSIjPPTGTS S PVTARNA I PEARCC WLLDGTTV I 
EAVRPARERLARKELRQKRMQQFSRDSAYSSNKDSTCLLTERDT 
LGTSLQFPSPFSGTISFGSFSDS5IFPLGSQCCLGFQQFS1SGK 
KWALIHKRVRLSVFGARWGRIYFGK


6297 


1 


922 


QRAAAASPSSCGPRGAEYGALMAMEGYWRFLALLGSALLVGFLS
V I FAL VW VLHYREGLG WDGS ALE FNWHP VLMVTG F VF I QG I AI I 
VYRLPWTWKCSKLLMKSIHAGLNAVAAILAIISWAVPENHNVN 
NIANMYSLHS WVGLIAVI CYLLQLLSGFSVFLLPWAPLSLRAFL 
MP IH VYSG I VI FGTVI ATALMGLTEKL I FSLRDPAYSTFPPEG V 
F VNTLGLL I L VFGAL I FW I VTR PQ WKRP KE PNSTI LHPNGGTEQ 
GARGSMPAYSGNNMDKSDSELNNEVAARKRNIALDEAGQRSTM I 


6298 


3 


985 



s v pijRRLslsgtlgx3AgtttkmavarijAavaawvpcrs wg waSv j 

PFGPHRGIiSVLLARIPQRAPRWLPACROKTSLSFLNRPDLPNLA 
If KKLKGKS PGIIFI PGYLSYMNGTKALAI EEFCKSLGHACI RFD 
ITS G VGS S DGNS E ES TLGKWRKDVLS 1 1 DDLADGPQ I L VG S S LGG 
tfLMLHAAI ARPEKWAL IGVATAADTLVTKFNQLP VELKKEVEM 
KGVWSMPSKYSEEGVYNVQYSFIKEAEHHCLLHSPIPVNCPIRL 
^HGMKDDIVPWHTSMQVADRVLSTDVDVILRKHSDHRMREKADI 
3LLVYTIDDLIDKLSTIVN | 


6299 


512 


814


BCDIjEGIMPNVTISLSIjPTl^gPIjOnTT.VHPr'VT'gT ticWtt iroo — 
«"<7Ajr' xr xjv J- / X-Uvr1L > v_V Io-LiDoAA LjTSS 

S I DAMDD S AFS G P YKF P FTPP LE S FNLCFYTSQ VP VP P I LG FYQ 

MKEEEVQLRNNH 


6300 


121 


692 


AAPS CWSQKC5 V PAAGTPS S PRLLVS RAAAPSAG P WGAWRQGARA 

AQSPFSIPNSSSVPYGSQDSVHSSPEDGGGGRDRPVGGSPGGPR 

LVIGSLPAHLSPHMFGGPKCPVCSKFVSSDEMDLHLVMCLTKPR 

ITYNEDVJLSKDAGECAICLEELQQGDTIARLPCIiCIYHKGCIDE 
WFEVNRSCPEHPSD 


6301 


616 


284 


GKFVPVNWEPPQPLPFPKYIjRCYRCLLETKELGCLLGSDICLTP 
AGSSCITLHKKNSSGSDVMVSDCRSKEQMSDCSNTRTSPVSGFW
IFSQYCFLDFCNDPQKRGLYTP 


6302 


490 


745 


IFGFLHLFHMEHSFLLVCALFAHVFFSSSCGSSVALHSDPCriLS 
PVLLNCIiPGDLRPIaDELYAQKLKYKAISEELDHALNDMTSL 


6303 


2 


1951 


YWNEYGGGLLWQSWQEKHPGQALSSEPWNFPDTKEEWEQHYSQL 
YWYYLEQFQYWEAQGWTFDASQSCDTDTYTSKTEADDKNDEKCM 
KVDLVS FLSS P IMGDNDSSGTSDKDHS EILDGI SNI KLNSEEVT 
QSQLDSCTSHDGHQQLSEVSSKRECPASGQSEPRNGGTNEESNS 
SGNTWTDPPAEDSQKSSGANTSKDRPHASGTDGDESEEDPPEHK 
PSKLKRSHELD IDENPASDFDDSGSLLGFKYGSGQKYGGI PNFS 
HRQ VR YLEKKVKb KSK YLDMRRQ IKMKNKHIFFTKES EKP FFKK 
SKILSKVEKFLTWVNKPMDEEASQESSSHDNGHDASrSCDSEEQ 
DMSViCKGDDLLETNNPEPEKCQSVSSAGELETENYERDSHiATV 
PDEODCVTQEVPDSRQAETEAEVKKKKNKKKNKKVNGLPPEIAA 
VPELAKYWAQRYRLFSRFDDGIKLDREGWFSVTPEKIAEHIAG^l 
VSQS FKCDVWDAFCGVGGNT I QFALTGMR VI AI D I DP VK I ALA 
RNNAE VYGI ADK I E FI CGDFLLLAS FLKAD WFLS P P WGGPD YA 
1AB 1 tUAKA WMi i?lX5 FE I FRlrS KK1 TNN I VYFLPRNADI DQ VAS 
LAGPGGQVEIEQNFLNNKLKTITAYFGDLIRRPASET 


6304 


1 


1438 


HRARVDRSRES PGGDLRH PGRVRRD I T1»SGH PR LS TQH WLLRE 
vCsVijUfKjil ^XiGHPQHGSPIQETQSEVVTJjVSPLPGSDMAALPA 
WRATSGLTLWPHTAEGRDLLGAENRALTGGQQAEDPTI*ASGAYQ 
WPGSVEKLQGSWJCDAETLLSSSRTGGQAPPWLTDHDVQMLRLL 
AQGEWDKARVPAHGQVLQVGFSTEAALQDLSSPRLSQLCSQGL 
aSLIKRPGDLPEVLSFHVDRVLGLRRSLPAVARRFHSPIiLPYRY 
TDGGAR PVI WWAPDVQHLSDPDEDQNSIiALGWLQ YQALLAHSCN 
WPGQAPCPGIHHTEWARLALFDFLLQVHDRLDRYCCGFEPEPSD 
PC VEERLREKCRNP AE LRLVH I LVRS S D PSHL VY I DNAGNLQH P 
EDKLN FRLLEG I DG F PES AVKVLASGCLQNM t*L KSLQMDPVFWE 
SQGGAQGLKQVLQTLEQRGQVLLGHIQKHNLTLFRDEDP 


6305 


33 


420 


WM I YiRGRS TYRPRPRRS VPPPEL IGPMLEPGDEEPQQEEPPTES 
RDPAPGQEREEDQGAAETQVPDLEADLQELSQSKTGDECGDGPD 
VQGKILTKSEQFKMPEGR 


6306 


1 


1874 



PTRP3KVKVPHTFLIHSYTRPTVCQACKKLLKGLFRQGLQCKDC 
KFNCHKRCATRVPNDCLGEALINGDVPMEEATDFSBADKSALMD 
ESSDSGVIPGSHSENALHASEEEEGEGGKAQSSLGYIPLMRWQ 
S VRHTTRKS S TTLREG WWHYSNKDTLRKRHYWRLDCKC I TL FQ 
NNTTNR Y YKE I PLSE I LTVES AQNFSLVPPGTNPHCFE I VTANA 
TYFVGEMPGGTPGGPSGQGAEAARGWETAIRQALMPVILQDAPS 
APGHAPHRQASLSISVSNSQIQENVDIATVYQIFPDEVLGSGQF 
GWYGGKHR KTGRDVAVKVIDKLR FPTKQESQLRKE VAI LQS LR 
FTPGIVNLECMFETPEKVFWMEKLHGDMLBMrLSSEKGRLPERL 
IKF h I TQ I I/VA LRHLH FKNI VHCDLKP ENVLLAS ADP FPQVKLC 
DFGFAR1 IGEKS FRRS WGTPAYUVPEVLLNQG YNRSLDMWS VG 
yiMYVSLSGTFPFNEDEDINDQIQNAAFMYPASPWSHISAGAID 
LINNLLQVKMRKRYSVDKSLSHPWI4QEYQTWI1DLRELEGKMGER 
iflTHESDDARWEQFAAEHPLPGSGLPTDRDIiGGACPPQDHDMQG 
SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)

LAERISVL


6307 
6308


2136 


589 


CFLLPRGRuyfiPPEAGAAAPCAPGAPDMSFRKVVRQSKFRHVFG ' 

QPVKNDQCYEDIRVSRVTWDSTFCAVNPKFLAVIVEASGGGAFL 

VLPLSKTGRIDKAYPTVCGHTGPVLDIDWCPHNDEVIASGSEDC 

TVMVWQIPENGLTSPLTEPWVLEGHTKRVGIIAWHPTARNVLL 

SAG CDNWL I WNVGTAEEL YRLDS LHPDL I YNVS WNHNGS LFCS 

ACKDKSVR I IDPRRGTLVAEREKAHEGARPMRAX FLADGKVFTT 

GFSRMSERQLAiWDPENLEEPMALQELDSSNGAIiLPFYDPDTSV 

VYVCGKGDSSIRYFEITEEPPYIHFLNTFTSKEPQRGMGSMPKR 
1 GXiEVS KCE T AR F YKLiHKP TCP PPT VM*P\f ra? vrnt Pr , nr , T ^ 

PEAALEAEE WVSGRDADP I LI SLREAYVPSKQRDLKIS RRNVLS 

DSRPAMAPGSSHLGAPASTTTAADATPSGSLARAGEAGKLEEVM 
QELRALRALVKEQGDRICRLEEOLGRMFNrna 


6309 


2 


1118 


GRPTR PEKMLL S LVLHT YS M R YLL PS VVLLGTAPT YVLAWG VWR 
LLSAFLPARFYQALDDRLYCVYQSMVIiFFFENYTGVQILLYGDL 
PKNKENIIYI^NHQSTVDWIVADILAIRQNALGHVRYVLKEGLK 
WLPLYGWYFAQHGGIYVKRSAKFNEKEMRNKLQSYVDAGTPMYL 
| VIFPEGTR YNPEQTKVL S ASQAFAAQRGLAVL KHVLTPR I KATH 
VAFDCMKNYLDAIYDVTWYEGKDDGGQRRESPTMTEFLCKECP 
KIHI H I DRI DKKDVPE EQEHMRRWLHER FEI KDKMI»I E F YES P D 

PERRIQIFPGKSVNSKT.c; TTTVTT^DQMT TT em tn«AiM> 

rujV3 vnar^Lta Xr<^.LLtf OMJjXla^QJJjTAGMijMTDAGRKti 

YVNTWI YGTLLGCLWVTI KA 


6310 


220 


563 


LVAEVKEPCTSLPMLSVDMEWKENGSVGVKNSMENGRPPDPADWA 
VMDVWYFRTVGFEEQASAFQEQEIDGKSLLLMTRNDVLTGLQL 
KLGP AI»K I YE YH VKP LQTKHLKNNS S 


6311 


36 


979 

675 ! 


GPRCWKFLIX J SSVNCETI,RIGKAWPQSSGQERYWTPRTHSSAS5"~ 

AORGSI±A3LWV7lAAf!r.W2\rir»rvnoT vnnnu^r -, . „,.. , 

jjrtwjjiN v/iMA^ijWAiJL-jjy PJjxDCPMCGLI CTN YH I L QEHV 

DLHLEENS FQQGMD R VQCSGDLQLtAlIQLQQEEDR KRRS EES RQE 

I EEFQ KLQRQ YGLDNSGG Y KQ QQLRNME I E VNRGRM P P S E FHRR 

KADMME SIiALG FDDGKTKTS G 1 1 EALHR YYQNAATDVRRVWIiS S 

WDHFHSSLGDKGWGCXSYRNFQMLLSSIiLQNDAYNDCI.KGMLIP 

CIPKIQSMIEDAWKEGFDPQGASQLIIRLQGTKAWIGACEVYIL 
LTSLRV 


6312 


1 




P VWWNS CEG PRLAAAARTGHG VG RRARLACLGEPR VKAAVP^LTIi 

ASKLKRDDGLKGSRTAATASDSTRRVSVRDKLLVKEVAELEANL 
PCTCKVHF PDPNKLHCFOLTVTPDEG Y YOno (TPnFPT^/Dn * v ■» 

MVPPKVKCLTKlWHPWITETGEICLSLLREHSIDGTGWAPTRTIi 
^^^GLNSLFTDLLNFDDPLNIEAAEHHLRDKEDFRNXVDDYI 


6313 


213 


1400 


^Diiii.VKREAGMKMLPGVGVFGTGSSARVLVPLLRAEGFTVEALW 
GKTEEEAKQLAEEMNIAFYTSRTDDIIjLHQDVDLVCIS I PPPLT 
RQISVKAIX5IGKNVVCEKAATSVDAFRMVTASRYYPQLMSLVGN 
VLRFLPAFVRMKQLISEHYVGAVMICDARIYSGSLLSPSYGWIC 
DELMGGGGLHTMGTYIVDLLTHLTGRRAEKVHGLLKTFVRQNAA 
IRGIRHVTSDDFCFFQMLMGGGVCSTVTriNFNMPGAFVHEVMVV 
GSAGRLVARGADLYGQKNS ATQ EELLLRDSIjAVGAGLP EQGPQD 
VPLLYLKGM VVM VQALRQS FQGQGDRRTWDRTP VSMAAS FEDGL 
yMQSWDAIKRSSRSGEWEAVEVLTEEPDTNQNLCEALOPKrMT. 




2 


2071


QRSGAARLAFLPSPFSPACVHRSPLSFHGCWFYFWVFMPLGVL
FHRRRAHGCTLSCSSFVEQPTAMEAEETMECLQEFPEHHKMILD
RLNEQREQDRFTDITLIVDGHHFKAHKAVLAACSKFFYKFFQEF
TQEPLVEIEGVSKMAFRHLIEFTYTAKLMIQGEEEANDVWKAAE
FLQMLEAIKALEVRNKENSAPLEENTTGKNEAKKRKIAETSNVI
TESLPSAESEPVEIEVEIAEGTIEVEDEGIETLEEVASAKQSVK
YIQSTGSSDDSALALIADITSKYRQGDRKGQIKEDGCPSDPTSK
QVEGIEIVELQLSHVKDLFHCEKCNRSFKLFYHFKEHMKSHSTE








SFKCEICNKRYLRESAWKQHLNCYHLEEGGVSKKQRTGKKIHVC
QYCEKQFDHFGHFKEHLRKHTGEKPFECPNCHERFARNSTLKCH
LTACQTGVGAKKGRKKLYECQVCNSVFNSWDQFKDHLVIHTGDK
PNHCTLCDLWFMQGNELRRHLSDAHNISERLVTEEVLSVETRVQ
TEPVTSMTIIEQVGKVHVLPLLQVQVDSAQVTVEQVHPDLLQDS
QVHDSHMSELPEQVQVSYLEVGRIQTEEGTEVHVEELHVERVNQ
MPVEVQTELLEADLDHVTPEIMNQEERESSQADAAEAAREDHED
AEDLETKPTVDSEAEKAENEDRTALPVLE 


6314 


2 


2071 


QRSGAARLAFLPSPFSPACVHRSPLSFHGCWFYFVWFMPLGVL
FHRRRAHGCTLSCSSFVEQPTAMEAEETMECLQEFPEHHKMILD
RLNEQREQDRFTDITLIVDGHHFKAHKAVLAACSKFFYKFFQEF
TQEPLVEIEGVSKMAFRHLIEFTYTAKLMIQGEEEANDVWKAAE
FLQMLEAIKALEVRNKENSAPLEENTTGKNEAKKRKIAETSNVI
TESLPSAESEPVEIEVEIAEGTIEVEDEGIETLEEVASAKQSVK
YIQSTGSSDDSALALLADITSKYRQGDRKGQIKEDGCPSDPTSK
QVEGIEIVELQLSHVKDLFHCEKCNRSFKLFYHFKEHMKSHSTE
SFKCEICNKRYLRESAWKQHLNCYHLEEGGVSKKQRTGKKIHVC
QYCEKQFDHFGHFKEHLRKHTGEKPFECPNCHERFARNSTLKCH
LTACQTGVGAKKGRKKLYECQVCNSVFNSWDQFKDHLVIHTGDK
PNHCTLCDLWFMQGNELRRHLSDAHNISERLVTEEVLSVETRVQ
TEPVTSMTIIEQVGKVHVLPLLQVQVDSAQVTVEQVHPDLLQDS
QVHDSHMSELPEQVQVSYLEVGRIQTEEGTEVHVEELHVERVNQ
MPVEVQTELLEADLDHVTPEIMNQEERESSQADAAEAAREDHED
AEDLETKPTVDSEAEKAENEDRTALPVLE 


6315 


1 


1015 


LGLAVNWl'TL VLI SYCPTATEEAPYttfTYLLCALGLFIYQSLDA 
IDGKQARRTNS CSPLGELFDHGCDSLS TVFMAVGAS IAARLGTY 
PDWFF S CS F IGMF VF YCAH WQTYVSGMLRFGKVD VTE IQ I AJUVI 
VFVLSAFGGATMWDYTIPI LEI KLK I LPVLGFLGGVI FSCSNYF 
HVILHGGVGKNGS T I AGTS VLS PGL H I G L 1 1 1 LA I M I YKKS ATD 
VFEKHPCLYILMFGCVFAKVSQKLWAHMTKSELYLQDTVFLGP 
GLLFLDQYFNNFIDE YWLWMAMVISS FDMVI YFSALCLQ ISRH 
LHLNI FKTACHQAPEQVQVLSS KSHQNWMD 


6316


1503 


792 


vsagagtgxmggttstrrv*feadenenitwkg±rlsenVidr 

MKESSPSGSKSQRYSGAYGASVSDEELKRRVAEBLALEQAKKES 

edqkrlkqakeldreraaaneqltrailrericseeerakakhl 

ARQLEEKDRVLKKQDAFYKEQLARLEERSSE FYRVTTEQ YQKAA 

EEVEAKFKRYESHPVCADLQAKILQCYRENTHQTLKCSALATQY 
MHCVNHAKQSMLE KGG 


6317 


102 


839 


PEAQTSAVIJ^RBKGHLPTMRHEAPMQW^AQDARYGQKDSSDQN 
FDYMFKLLIIGNSSVGKTSFLFRYADDSFTSAFVSTVGIDFKVK 
TVFKNEKR I KLQ I WDTAGQERYRT ITTAY YRGAMG F 1 LMYD I TN 
EES FNAVQD WS TQ I KT YSWDNAQVILVGNKCDME DERV I STERG 
QHLGEQLGF EFFETS AKDNI NVKQTFERLVD 1 1 CD KMSES LE TD 
PA I TAAKQNTR LKETP PPPQ pncac 


6318 


1765 


733 


PWHPLRTLPLHHPHPRPPRAEf?RPr:anQMQUT.Dr2T pi DDD&nm — 
LG P LLS P F PLPAGS WHRQMLRSSLRFPI TNS AGA PC KAAGRMNI 
IAP\mRDRVLAELPQCLRKEAALHGHKDFHPRVTCACQEHRTGT 
VGFKISKVIWGDLSVGKTCLINRFCKDTFDKNYKATIGVDFEM 
ERFEVLGI PFSLQLWDTAGQERFKCIASTYYRGAQAI I IVFNLN 
D VAS LEHT KQ WLADALKENDP5 S VLLFLVGS KKDLSTPAQ YALM 
EKDALOVAQEMKAEYWAVSSLTGENVREFFFRVAALTFEANVLA 
E LEKSG AR R IGDWR I NSDDS NL YLTAS XKKPTCCP 


6319 


88 


717 


AATMRLNQNTLLIXSKKVVLVPYTSEHVPSRYkEWMKSEELQRLT 
AS EPLTLEQE YAMQ CS WQEDADKCT F I VLDAE KWQ AQPGATEES 
CMVGDVNLFLTDLEDLTLGE I E VM I AE PSCRGKGLGTE A VLAM L 
SYGVTTLGLTKFEAKIGQGNEPSIRMFQKLHFEQVATSSVFQEV 








TLRLTVSESEHQWIjLEQTSHVEEKPYRDGSAEPC 


6320 


90 


1111 


rprtgrekvamaavdsfyllyreiarscncymeaialvgawyta 
rksitvicdfyslirlhfiprlgsradlikqygrwawsgatdg 
ig ka yaeelas rgln i ilisrneeklqvvakdiadtykvetdi i 
vadfssgre i ylp irealkdkdvgi lvnnvgvfypypq yftqls 
edklwd i invniaaaslmvhwlpgmverkkgai vtissgscck 
ptpqlaafsaskayxdhfsralqyeyaskgi fvqslipfyvats 
mtapsnflhrcs wlvps pkvyahhavstlg is krttgywshs iq 
flfaqympewlwvwganilnrslrkealscta 


6321 


1418 


341 


HRKAAJ^ALMAGRLLGKALAAVSLSLALASVTIRSSRCRGIQAF 
RNSFSSSWFHLNTNVMSGSNGSKENSHNKARTSPYPGSKVERSQ 
VPNEKVGWLVEWQDYKPVEYTAVSVLAGPRWADPQISESNFSPK 
FNEKDGHVERKSKNGLYE I ENGR PRN P AGRTGLVGRGIi LGRWGP 
NHAADPIITRWKRDSSGNKIMHPVSGKHILQFVAIKRKDCGEWA 
I PGGMVDPGE KI S ATL KRE FGEBALNSLQ KTS AEKRE I E E KJLHK 
LFS QDHLVI YKGYVDDPRNTDNAWMETEAVNYHDETGE IMDNLM 
LiEAGDDAGKVKWVD INDKLKL YAS HSQFI KLVAEKRDAHWS EDS 
EADCHAL 


6322 


2047 


1083 


NQEILKNVESSRTVQPHFLEFLLSLGWSVDVGRHPGWTGHVSTS 
WS IN CCDDGEGS QQEE VI SSEDIGASI FNGQKKVL YYADALTE I 
AFWPSPVESLTDSLESNlSDQDSDSNMDLMPGILKQPSLTbEI. 
FPNHTDNLNSSQRLSPSSRMRKLPQGRPVPPLGPETRVSWWVE 
RYDDI ENFPLS ELMTE ISTGVETTANSSTSLRSTTLEKE VPVI F 
IHPLNTGLFRI KIQGATGKFNMVI PLVDGMI VSRRALGFLVRQT 
VIN I CRRKRIiES DS YS P PHVRRKQ K I TD IVNKYRNKQLE PEF YT 
SLFQEVGIiKNCSS 


6323 


1 


656 


PASTTDGAQE AR VPLDG AF W I PRP PAGSPKGCFAC VS KPPAliQA" 
PAAPAPEPSASPPMAPTLFPMESKSSKTDSVRAAGAPPACKHLA 
EKKTMTNPTTVIEVYPDTTEVNDYYLWSIFNFVYLNFCCIiGFIA 
LAYSLKVRDKKLLTTOIiNGAVEDAKTDRIilNITRSGLJ^ASCIMLW 
MALSVIATHRGLRSSASILVAEPKDWNTERPQVTFRERCPAL 


6324 


1 


2061 


EGAGMRRCPCRGSLNEAEAGALPAAARMGUBAPRGGRRRQPGQQ 
RPG PGAGA PAG R PEGGG PW ARTEGS S LHS E PERAGLGPAPGTES 
PQAEFWTDGQTEPAAAGLGVETERPKQKTEPDRSSLRTHLEWSW 
SELGTTCLWTETGTDGLWTDPHRSDLQFQPEEASPWTQPGVHGP 
WTELETHGSQTQPERVKSWADNLWTHQNSSSLQTHPEGACPSKE 
PSADGSWKELYTDGSRTQQDIEGPWTEPYTDGSQKKQDTEAARK 
QPGTGGFQIQQDTDGSWTQPSTDGSQTAPGTDCLLGEPEDGPLE 
EPEPGELLTHLYSHLKCSPLCPVPRLIITPETPEPEAQPVGPPS 
RVEGGSGGFSSAS S FDESEDD WAGGGGASDPEDRSGSKPWKKL 
KTVLKYS PF WS FRKHYP WQLSGHAGNPQAGEDGRI LKRFCQC 
EQRSLEQLMKDPLRPFVPAYYGMVLQDGQTFNQMEDIiLADFEGP 
S IMDCKMGS RTYLEEEIj VKARERPRPR KDM YE KMVAVDPGAPTP 
E EHAQGAVTKP R YMQWRETMSS TSTLG FR I EG I KKADGTCNTN F 
KKTQALEQVTKVLEDFVDGDHVTT^JlfYVaPT.WPT i3T?n t otot»dc» 
KTHEWGSSLLFVHDHTGLAKVWM I DFGKTVALPDHQTLSHRLP 
WAEGNREDGYLWGLDNMICLLQGIAQS 


6325 


165 


944 


GLRDPFRRKRRLKPQVKMSNYVNDMWPGSPQEKDS PS TSRSGGS 
SRLSSRSRSRSFSRSSRSHSRVSSRFSSRSRRSKSRSRSRRRHQ 
R KYRR YSRS YS RS RSRSRSRRYRERR YG FTRRYYRS P SR YRSRS 
RS R S RS RGRS YCGRA YAI ARGQR YYG FGRTV Y PEEHS RWRDRSR 
TRSRSRTPFRLSEKDRMELLEIAKTNAAKALGTTNIDLPASLRT 
VPSAKETSRGIGVSSNGAKPEVSILGLSEQNFQKANCQI 


6326 


238 


680 



GEPS PATQQKPS ATGAGVLHQHFS SGH I YVLMGLL P P PWT I S FT 
VQTTLQPPGGLPAAPVSGRMAFEPVGRDLARRMVPRAGKRTQTL 
3ARRVAAQGARPl,PEDRRPKSGERLHVTVAPCWEFVLPSVSLTA 








QAWGGVGQEASSGVP 


6327 


1 


1337 


S LARLAP AGGS WM P TQQPAAP STRAPKPSRSXjSG S LCALFS DA 
DSGSGMKAELPPGPGAVGREMTKEEKLQLRKEKKQQKKKRKEEK 
GAEPETGSAVSAAQCQGPTRELPESGIQLGTPREKVPAGRSKAE 
LRAERRAKQEAERALKQARKGEQGGPPPKASPSTAGETPSGVKR 
LPEYPQVX>DLLLRRLVKKPERQQVPTRKDYGSKVSIiFSHLPQYS 
RQNSLTQFMS I PS SVIHPAM VRLGIjQ YSQGL VRGSKARCI AJULR 
ALQQVIQDYTTPPNEELiSRDLVNKLKPYMSFLTQCRPLSASMHN 
AI KFLNKE ITS VGSSKREEEAKSELRAAIDRY VQEKI VLAAQAI 
SRFAYQKI SNGDVI L VYGCS S LVSRI LQEAWTEGRR FRWVVDS 
RPWLEGRHTLRSLVHAGVPAS YLLIPAAS YVLPEVS TEEKDSKV 
GGEKV 


6328 


1030 


276 


HASAE VTTAAARGLGAMEEEMHTDAKIRAENGTGSS PRGPGCS L 
RHFACEQNLLSRPDGSASFLQGDTSVLAGVYGPAEVKVSKE1FN 
KATLEVILRPKIGLPGVAEKSRERLIRNTCEAWLGTLH PRTS I 
TWDQWS DAG S LIiAC CIJ^AACMAL VDAG VPMRALFCG VACALD 
S DGTLYLD P TS KQE KE ARAVLT FALDS VERKLLMS S T KGI/YS DT 
ELQQCLAAAQAASQHVFRFYRESLQRRYSKS 


6329 


3 


2016 


SSEVAAGGGTRSAMAEGSGEWTVSATGAANGLNNGAGGTSATT 
SNPLSRKLHKILETRLDNDKEMIiEALKALSTFFVENSLRTRRNL 
RGDI ERKS LAINEE FVS I FKEVKEELES IS EDVQAMSNCCQDMT 
SRJjQAAKEQTQDLIVKTTKLiQSESQICLEIRAQVADAFIjS kfqlt 
SDEMS LLRGTREGP I TEDFFKALGRVKQIHNDVKVLLRTNQQTA 
GLEIMEQMALLQETAYERLYRWAQSECRTLTQESCDVSPVIiTQA 
MEALQDRP VLYKYTLDEFGTARRSTWRGF I DALTRGGPGGTPR 
P I EMHSHDPLR YVGDMLAWDHQATASE KEHLEALLKHVTTQGVE 
ENIQEVVGH I TE<3 VCR PLKVRIEQ VI VAEPGAVLIiYK I SNIitiKF 
YHHTISGIVGNSATALLTTIEEMHIjLSKKIFFNSLSIjHASKLMD 
KVEL PP PDLGPSSALNQTLMLLRE VLASHDSS WPLDARQADFV 

QVLS CVLD pllqmctvsasnlgtadmatfm VNSIiYMMKTTLALF 
eftdrrlemlqfqieahldtlineqasyvltrvglsyiyntvqq 
hkpeqgslanmpnlds vtlkaamvqfdrylsapdnll I PQLNFL 
lsatvkeqivkqstelvcraygevyaavmnpineykdpenilhr 
spqqvqtlls 


6330 


1151 


333 


ffyytfyenktfsrkmvaeketlslnkcpdkmpkrtkllaqqpl " 
pvhqphslvsegftvkammknswrgppaagafkerptkptafr 
kfyergdfpialehdskgnkiawkveibkldyhhylplffdglc 
ehtfpyeffarqgihdmlehggnkilpvlpqliipiknalnlrn 
RQVI CVTLKVIiQHIjWS AEMVGKALVP yyrqi lp vlni fknmnv 

nsgdgidysqqkrenigdliqetleaferyggenafinikywp 
tyesclln 


6331 


3 


49S 


qqgqrvrtrgrracasatplex3cvdlsyprthaallkvaqmvtl 
liaficvrsslwtnysaysyfewticdlimilafylvhlfrfy 

RVLTCISWPLSELLHYLIGTLLLLIASIVAASKSYNQSGIjVAGA 
i fgfmatflcmasi wls ykiscvtqstdaav 


6332 


1 


878 


.VTESNKFDJbVSFIPLLRERIYSNNQYARQFilSWILVLE'SVP"DI 
NLLDYLPEILDGLFQILGDNGKEIRKMCEWLGEFLKEIKKNPS 
S VKFAEMANI LVIHCQTTDDIilQLTAMCWMREFIQLAGRVML P Y 
SSGIIiTAVLPCLAYDDRKKS I KE VANVCNQS LM KL VT PEDDE LD 
ELRPGQRQAE PTPDDAL PKQSGTAS GE WTPSLHIjTS CRGPR E PD 
VIGVALGPHLSNQDYFMYVTHTIVAATQRSGSSGSPPFCRQDTG 
KLSTMATHSQLVKTGTGLEPRQAVSSSH 


6333


3 


1467 


TRTPSEAEAGGESPQSCVSAAHSDWTAGKPVSLLAPLIPPRSAG^ 
QPLTF S PSGRQ PI»RSLI»VGMCSGSGRRRSSLSPTMRPGTGAERG 
GJUMMGHPGMHYAPMGMHPMGQRANMPPVPHGMMPQMMPPMGGPP 
MGQMPGMMSSVMPGMMMSHMSQASMQPALPPGVNSMDVAAGTAS 








GAKSMWTEHKSPDQRTYYYNTBTK05TWRK-pr>nr>VT'r»&K»r>T TC v — 

CPWKSYKSDSGKPYYYNSQTKESRWAKPKELEDLEGYQNTrVAG 
SUTXSNLHAMrfCAEESSKQEECTTTSTAPVPTTEIPTTMSTMA 
AAEAAAAWAAAAAAAAAAAAANANASTSASNTVSGT VPWPE P 
E VTS I VAT WDNENTVT I S TEE OAOLTQTD A t nnnc\rc^re pxtoo 

EETSKQETVADPTPKKEEEESQPAKKTYTWNTKEEAKQAFKELL 
KEKRVPSNAS WEQAMKMI INDPR YSALAKLS EKKQAFNAY KVQT 
EKK 


6334 
6335


17 


644 


GGNPSGRAAGFAAAAMPSS PLRVAVVCSSNQNRSMEAHN 1LSKR 
GFSVRSFGTGTHVKLPGPAPDlCPNVvniyirT'T»vnr*N/ivxrnT t nvm, 

EliYTQNGII.HMLDRNKRIKPRPERFQNCKDLFDIilLTCEERVYD 
QWEDLNSREQETCQPVH WNVDIQDNHEEATLGAFL ICELCQC 
I QH TE DMENE I D ELLQE FEE KSGRTFLHT VCF Y 




82 


S29 


KLRQGENIiILGFSIGGGIDQDPSQNPFSEDKTDKGIYVTRVSEG 
G PAE I AGLQI GD KI MQ VNGWDMTM VTHDQARKRIiTKRS EEWRL 
LVTRQS LQKAVQQSMLS 


6336 


1003 


438 


HE PAS KGRAK VGNMRLS VAAAISHGRVFRRMGLGPESRIHLLRN 

LLTGLVRHERIEAPWARVDEMRGYAEKLIDYGKLGDTNERAMRM 

ADFWLTEKDLIPKLPQVIiAPRYKDQTGGYTRMLQIPNRSIiDRAK 

MAVIEYKGNCLPPLPLPRRDSHLTLLNQIiLQGLRQDIiRQSOEAS 
NHSSHTAQTPGI 


6337 


76 


524 


EGIQMLSVgPUTKPKGCAGCNRKIKDRYLLKAI^XYWHEDCLKcH 
ACCDCRLGEVGSTTiVf bmt tt roonvr or Tjrt,™,,-,^, , _ 
™- wvuiuMB v\» 1 u x x jvhjnjj J. JjUKKU x LRLFGVTGNCAACSIGL»I 

PAFEM VMRAKDIfVYHLDCFACQLCNQRFCVGD KF FLKNNM I LCO 
TDYEEGLMKEGYAPQVR 


6338 
6339 




1349 


APNSESGTQ<iPliPTPANI,FWTRRANPDPTTSMSATDRMGPKAVP — 
GLRLALLLLLGLGTPKSGVQGQEGLDFPEYDGVDRVINVWAKNY 
KNVFKXYEVl^LYHEPPEDDKASQRQFEMEBLILELAAQVLED 
KGVGFGLVDSEKDAAVAKKLGLTEVDSMYVFKGDEV1EYDGEFS 
ADT I VEFLLDVLEDPVEL I EGERELQAFENI EDE I KLIGYFKS K 
DSEH YKAFEDAAEEFHP Y I PFFATFDS KGAKKLTLKLNE I D FYE 
AFMEEPVTIPDKPNSEEEIWFVEEHPJRSTLRKLKPESMYETWE 
DDMDG I H I VAFAEE AO PDGFE FLETLKAVAQDNTENEDLS 1 1 W I 

DPDDFPLLVPYWEKTFDIDLSAPQIGVVNVTDADRLWMEMDDEE 
DLPSAEELEDWLEDVLEGEINTEDDDDDDDD 


6340


246 


1813 


NRCDRGGGGaAERQAGQGCRTQGAGPGB-GFGHSFFSQGAMKAFH 
TFCWLLVFGS VSEAKFDDFEDEED I VEYDCNDFAEFED VMEDS 
VTESPQRVIITEDDEDETTVELEGQDENQEGDFBDADTQEGDTE 
SEPYDDEEFEGYEDKPDTSSSKNKDPITIVDVPAHLQNSWESYY 
LEILMVTGLIAYIMNYIIGKNK2^SRTJ\OAWFT3TMPItt r pcmott 

VGDDGTNKEATSTGKLNQENEHIYNLWCSGRVCCEGMLIQLRFL 
KRQDLLNVLARMMR P VSDQVQ I KVTMNDEDMDT YV FAVGTR KAL 
VR LQKEMQDLSE FCSDKP KSGAKYGLPDSLAILS EMGE VTDGMM 
DTKMVHFLTHYADKIESVHFSDQFSGPKIMQEEGQPLKLPDTKR 
TLLLTFNVPGSGNT YP KDME ALL PLMNM V I YS I DKAKKFRLNRE 

GKQKADKNPJ\RVBENFLKLTHVQRQEAAQSRPJSEKKRAEKERIM 
WEEDPEKQRRLEEAALRREQKKLEKKQMKMKQIKVKAM 




2 


583 



KACAHTLS CPAFARLGRARRRP WMSHRTS STFRAERS FHSSSSQ — 
SS S STS SS AS RALPAQD PPME KALSM FSDDFGS FMRPHS BP LAF 
PARPGG AGN I KTLGDAYEFAVD VRDFS PED 1 1 VTTSNNHIEVRA 
5 KLAAIX3TVMNNFAH KCQLPE DVDPTSVTSALRE DGS LTI RARR 
■IPHTEHVQQTFRTEIKI 


6341 


2 

i 


£45" ] 


KMAVLS APGL>KCJFR I LGLRS S VGPAVQARGVHQS VATDGPSSTQ 
3 AL PKARA VA P KP S S RGE YVVAKLD DLVN WARRS S L W PMTFG LA 
:CAVEMMHMAAPRYDMDRFGWFRASFRQSPVMIVAQTLTNKMA 








PALRKVYDQMPEPRYWSMGSCANGGGYYHYSYSWRGCDRIVP 
VD I YI PG C P P TAEALL YG I LQLQRK I KRERRLQ I W YRR 


6342 


2 


1191 


DPRVRAMLAriaARVAALRKTCLFSGRGGGRGLWTGRPQSDMNNI 
KPLEGVK I LDLTRVLAGP FATMNLGDLG AE VI KVER PGAGDDTR 
TWGPPPVGTESTYYLSVNRNKKSIAVNIKDPKGVKIIKELAAVC 
DVPVENYVPGKLSAMGLG YEDIDB IAPHI I YCSITGYGQTGPIS 
ORAGYDAVASAVSGIiMHTTGPEVArr.«5HTBanTvr TrnirPuwBtw 

TAHGSIVPYQAFKTKDGYIWGAGNNQQFATVCKILDLPELIDN 
SKYKTNHLRVHNRKELIKILSERFEEELTSKWLYLFEGSGVPYG 
PINNMKNVFAEPQVLHNGLVMEMEHPTVGKISVPGPAVRYSKFK 
MSEARPFPLLGQHTTHILKEVLRYDDRAIGELLSAGWDQHETH 


6343 


2 


936 


GTAM VSDEDELNLLV1 WDANP I W WGKQALKES Q FTLS KCI DAV ' " 
MVLGNSHI* FMMRSNKIiAVI ASH I Q ES RFL YPGKNGRLGD F FGEP 
GNPPE FNPSGS KDGKYE LLTSANE VI VEE I KDLMTKS D I KGQHT 
ETLLAGSLAKALCYIHRMNKEVKDNQEMKSRILVIKAAEDSALQ 
YMNFMNVIFAAQKQNIIiIDACVLDSDSGLLQQACDITGGIiYLKV 
PQMPSLLQYLLWVFLPDQDQRSQLILPPPVHVDYRAACFCHRNL 
I E IGYVGS VCI»S I FCNFS P I CTTCETAFKIS LPPVLKAKKKKLK 
VSA 


6344 
6345


2508 


147 


TMPTATLGNLRGYGMASPGLAAPSLTPPQLATPNLQQFFPQATR 
QSLLGPPPVGVPMNPSQFNI>SGRNPQKQARTSSSTTPNRKDSSS 
QTMPVEDKSDPPEGSEEAAEPRMDTPEDQDLPPCPEDIAKEKRT 
PAPEPEPCEASELPAKRLRSSEEPTEKEPPGQLQVKAQPOARWT 
VP KQTQTPDLL PEAU3AQVL P RFQ PRVLQVQAQVQ S QTQPR I PS 
j- w^r^JUivvAUiViii'ij.MijVJbyQKQVQPQLQQSAEPQKQVQ 
PQVQPQAHSQGPRQVQLQQEAEPLKQVQPQVQPQAHSQPPRQVQ 
LQLQKQVQTQTYPQVHTQAQPSVQPQEHPPAQVSVQPPEQTHEQ 
PHTQPQVSLLAPEQTPWVHVCGLEMPPDAVEAGGGMEKTLPEP 
VGTQVSMEEIQNESACX3LDVGECENRAREMPGVWGAGGSLKVTI 
LQSSDSRAFSTVPLTPVPRPSDSVSSTPAATSTPSKQALQFFCY 
ICKASCSSQQEFQDHMSEPQHQQRLGEIQHMSQACI,LSLLPVPR 
DVLETEDEEPPPRRWCNTCQLYYMGDLIQHRRTQDHKIAKQSLR 
PFCTVCNRYFKTPRKFVEHVKSQGHKDKAKELKSLEKEIAGQDE 
DHFITVDAVGCFEGDEEEEEDDEDEEEIEVEEELCKQVRSRDIS 
REE WKGS ETY S PNTAYGVDFL VP VMG YI CRI CHKFYHSNSGAQ L 
SHCKS LGHFENLQKYKAAKNPSPTTRP VSRRCAINARNALTAXiF 
TSSGRPPSQPNTQDKTPSKVTARPSQPPLPRRSTRLKT 




2 


3483 



PRVRTKLI LLjVNDKKRYERVGGGPKRIjGRDVEMEEM I EQLQEKV 

HEIjEKQNDTLKNRIj I SAKQQLQTQGYRQTPYNNVQSRINTGRRK 

ANENAGLQECPRKGIKFQDADVAETPHPMFTKYGNSLLEEARGE 

IRNLEWVIQSQRGQIEELEHIiAEILKTQLRRKENEIELSLLQLR 

EQQATDQRSNIRDNVEMIKLHKQLVEKSWALSAMEGKFIQLQEK 

QRTLK ISHDALMANGDE LNMQLKEQRLKCCS LEKQLH S M KFSER 

R I EELQDR INDLE KERELLKEN YDKL YDSAFS AAHEEQ MKLKEO 

QLKVQIAQLETALKSDLTDKTEILDRLKTERDQNEKLVQENREL 

QLQYLEQKQQLDBIiKKRIKLYNQENDINADELSEALLLIKAQKE 

QKNGOL3 FLVKVDS E I N KDLERS MRELCATHAET VQEL3KTRNM 

L I MQHKINKD YQMEVEAVTRKMENIiQOD YELKVEQ YVHLLD I RA 

ARIHKLEAQLKDIAYGTKQYKFKPEIMPDDSVDEFDETIHLERG 

EWLFEIHINKVTFSSEVLQASGDKEPVTFCTYAFYDFELQTTPV 

TOGJLHPEYNFrSQYLVHVNDLFLQYIQKNTITLEVHQAYSTEYE 

riAACQLKFHElLEKSGRIFCTASLIGTKGDIPNFGTVEYWFRL 

^VPMDQAIRLYRERAKALGYITSNFKGPEHMQSLSQOAPKTAQI, 

5 STDS TDGNLNELK I T I R CCNHIjQSRASHIiQPHP YWYKFFDFA 

DHDTA 1 1 PS SNDPQ FDDHM YFP VPMNMDLDR YLKS ES IiS FYVFD 

)SDTQENIYIGKVNVPLI5LAHDRCISGIFELTDHQKHPAGTIH 



6346 



2921 



533 



VI IiKWKFAYIjPPSGS ITTEDLiGNFIR SEEPETWQRIjPPASS VST " 

lvlaprpkprqrltpvdkkvsfvdimphqsdvsqegsvdevken 
tekmqqgkddvsllsegqlaeqslassedeteitedlepeveed 

MSASDSDDCIIPGPISKNIKQPSEKIRIEIIALSLNDSQVTMDD 
T I QRLF VE CRF YS I, PAE ETPVSL PKP KSGQ WVY YNYSNV I YVDK 
ENNKAKRD I LKAI LQKQEMPNRSLRFTWSDPP EDEQDLECEDI 
G VAH VDLADWFQEGRDL I EQNIOVFDARADGEG IGKLR VTVEAEj 
HALQSVYKQYRDDLEA. 



6347



2921 



QDRRLLRLELQKTCQPTSTMSGSHTPACGPFSALTPSIWPQEIL
AKYTQKEESAEQPEFYYDEFGPRVYKEEGDEPGSSLLANSPLME
DAPQRLRWQAHLEFTHNHDVGDLTWDKIAVSLPRSEKLRSLVLA
GIPHGMRPQLWMRLSGALQKKRNSELSYREIVKNSSNDETIAAK
QIEKDLLRTMPSNACFASMGSIGVPRLRRVLRAIAWLYPEIGYC
QGTGMVAACLLLFLEEEDAFWMMSAIIEDLLPASYFSTTLLGVQ
TDQRVLRHLIVQYLPRLDKLLQEHDIELSLITLHWFLTAFASW
DIKLLLRIMDLFFYEGSRVLFQLTLGMLHLKEEELIQSENSASI
FNTLSDIPSQMEDAELLLGVAMRLAGSLTDVAVETQRRKHLAYL
IADQGQLLGAGTLTNLSQVVRRRTQRRKSTITALLFGEDDLEAL
KAKNIKQTELVADLREAILRVARHFQCTDPKNCSWSRQLPGLL
PNTALTPPTPLVGLYSLWQELTPDYSMESHQRDHENYVACSRSH
RRRAKALLDFERHDDDELGFRKNDIITIVSQKDEHCWVGELNGL
RGWFPAKFVEVLDERSKEYSIAGDDSVTEGVTDLVRGTLCPALK
ALFEHGLKKPSLLGGACHPWLFIEEAAGREVERDFASVYSRLVL
CKTFRLDEDGKVLTPEELLYRAVQSVNVTHDAVHAQMDVKLRSL
ICVGLNEQVLHLWLEVLCSSLPTVEKWYQPWSFLRSPGWVQIKC
ELRVLCCFAFSLSQDWELPAKREAQQPLKEGVRDMLVKHHLFSW
DVDG



533 



6348 



3679



QDRRLLRLELQKTCQPTSTMSGSHTPACGPFSALTPSIWPQEIL
AKYTQKEESAEQPEFYYDEFGFRVYKEEGDEPGSSLLANSPLME
DAPQRLRWQAHLEFTHNHDVGDLTWDKIAVSLPRSEKLRSLVLA
GIPHGMRPQLWMRLSGALQKKRNSELSYREIVKNSSNDETIAAK
QIEKDLLRTMPSNACFASMGSIGVPRLRRVLRAIAWLYPEIGYC
QGTGMVAACLLLFLEEEDAFWMMSAIIEDLLPASYFSTTLLGVQ
TDQRVLRHLIVQYLPRLDKLLQEHDIELSLITLHWFLTAFASW
DIKLLLRIWDLFFYEGSRVLFQLTLGMLHLKEEELIQSENSASI
FNTLSDIPSQMEDAELLLGVAMRLAGSLTDVAVETQRRKHLAYL
IADQGQLLGAGTLTNLSQVVRRRTQRRKSTITALLFGEDDLEAL
KAKNIKQTELVADLREAILRVARHFQCTDPKNCSWSRQLPGLL
PNTALTPPTPLVGLYSLWQELTPDYSMESHQRDHENYVACSRSH
RRRAKALLDFERHDDDELGFRKNDIITIVSQKDEHCWVGELNGL
RGWFPAKFVEVLDERSKEYSIAGDDSVTEGVTDLVRGTLCPALK
ALFEHGLKKPSLLGGACHPWLFIEEAAGREVERDFASVYSRLVL
CKTFRLDEDGKVLTPEELLYRAVQSVNVTHDAVHAQMDVKLRSL
ICVGLNEQVLHLWLEVLCSSLPTVEKWYQPWSFLRSPGWVQIKC
ELRVLCCFAFSLSQDWELPAKREAQQPLKEGVRDMLVKHHLFSW
DVDG

AGAEKCFVTLLACFLAKQQNKYKYEECKDLIKSMLRNELQFKEE
KIAEQLKQAEELRQYKVLVHSQERELTQLREKLREGRDASRSLN
EHLQALLTPDEPDKSQGQDLQEQLAEGCRLAQHLVQKLSPENDN
DDDEDVQVEVAEKVQKSSSPREMQKAEEKEVPEDSLEECAITCS
NSHGPCDSNQPHKNIKITFEEDEVNSTLWDRESSHDECQDALN
ILPVPGPTSSATNVSMWSAGPLSGEKAAINILEINEKLRPQLA
EKKQQFRNLKEKCFLTQLACFLANQQNKYKYEECKDLIKFMLRN
ERQFKEEKLAEQLKQAEELRQYKVLVHSQERELTQLREKLREGR
DASRSLNEHLQALLTPDEPDKSQGQDLQEQLAEGCRLAQHLVQK
LSPENDNDDDEDVQVEVAEKVQKSSAPREMPKAEEKEVPEDSLE








ECAITCSNSHGPYDSNQPHRKTKITFEEDKVDSTLIGSSSHVEW
EDAVHIIPENESDDEEEEEKGPVSPRNLQESEEEEVPQESWDEG
YSTLSIPPEMLASYKSYSSTFHSLEEQQVCMAVDIGRHRWDQVK
KEDHEATGPRLSRELLDEKGPEVLQDSLDRCYSTPSGCLELTDS
CQPYRSAPYVLEQQRVGLAVNMDEIEKYQEVEEDQDPSCPRLSR
ELLDEKEPEVLQDSLGRCYSTPSGYLELPDLGQPYSSAVYSLEE
QYLGLALDVDRIKKDQEEEEDQGPPCPRLSRELLEVVEPEVLQD
SLDRCYSTPSSCLEQPDSCQPYGSSFYALEEKHVGFSLDVGEIE
KKGKGKKRRGRRSKKERRRGRKEGEEDQNPPCPRLSRELLDEKG
PEVLQDSLDRCYSTPSGCLELTDSCQPYRSAFYILEQQRVGLAV
DMDEIEKYQEVEEDQDPSCPRLSGELLDEKEPEVLQESLDRCYS
TPSGCLELTDSCQPYRSAFYILEQQRVGLAVDMDEIEKYQEVEE
DQDPSCPRLSRELLDEKEPEVLQDSLGRCYSTPSGYLELPDLGQ
PYSSAVYSLEEQYLGLALDVDRIKKDQEEEEDQGPPCPRLSREL
LEWEPEVLQDSLDRCYSTPSSCLEQPDSCQPYGSSFYALEEKH
VGFSLDVGEIEKKGKGKKRRGRRSKKERRRGRKEGEEDQNPPCP
RLNSMLMEVEEPEVLQDSLDICYSTPSMYFELPDSFQHYRSVFY
SFEEEHISFALYVDNRFFTLTVTSLHLVFQMGVIFPQ


6349 
6350 


3 



3679 


AGAEKCFVTLLACFLAKQQNKYKYEECKDLIKSMLRNELQFKEE
KLAEQLKQAEELRQYKVLVHSQERELTQLREKLREGRDASRSLN
EHLQALLTPDEPDKSQGQDLQEQLAEGCRLAQHLVQKLSPENDN
DDDEDVQVEVAEKVQKSSSPREMQKAEEKEVPEDSLEECAITCS
NSHGPCDSNQPHKNIKITFEEDEVNSTLWDRESSHDECQDALN
ILPVPGPTSSATNVSMWSAGPLSGEKAAINILEINEKLRPQLA
EKKQQFRNLKEKCFLTQLACFLANQQNKYKYEECKDLIKFMLRN
ERQFKEEKLAEQLKQAEELRQYKVLVHSQERELTQLREKLREGR
DASRSLNEHLQALLTPDEPDKSQGQDLQEQLAEGCRLAQHLVQK
LSPENDNDDDEDVQVEVAEKVQKSSAPREMPKAEEKEVPEDSLE

ECAITCSNSHGPYDSNQPHRKTKITFEEDKVDSTLIGSSSHVEW
EDAVHIIPENESDDEEEEEKGPVSPRNLQESEEEEVPQESWDEG
YSTLSIPPEMLASYKSYSSTFHSLEEQQVCMAVDIGRHRWDQVK
KEDHEATGPRLSRELLDEKGPEVLQDSLDRCYSTPSGCLELTDS
CQPYRSAFYVLEQQRVGLAVNMDEIEKYQEVEEDQDPSCPRLSR
ELLDEKEPEVLQDSLGRCYSTPSGYLELPDLGQPYSSAVYSLEE
QYLGLALDVDRIKKDQEEEEDQGPPCPRLSRELLEVVEPEVLQD
SLDRCYSTPSSCLEQPDSCQPYGSSFYALEEKHVGFSLDVGEIE
KKGKGKKRRGRRSKKERRRGRKEGEEDQNPPCPRLSRELLDEKG
PEVLQDSLDRCYSTPSGCLELTDSCQPYRSAFYILEQQRVGLAV
DMDEIEKYQEVEEDQDPSCPRLSGELLDEKEPEVLQESLDRCYS
TPSGCLELTDSCQPYRSAFYILEQQRVGLAVDMDEIEKYQEVEE
DQDPSCPRLSRELLDEKEPEVLQDSLGRCYSTPSGYLELPDLGQ
PYSSAVYSLEEQYLGLALDVDRIKKDQEEEEDQGPPCPRLSREL
LEWEPEVLQDSLDRCYSTPSSCLEQPDSCQPYGSSFYALEEKH
VGFSLDVGEIEKKGKGKKRRGRRSKKERRRGRKEGEEDQNPPCP
RLNSMLMEVEEPEVLQDSLDICYSTPSMYFELPDSFQHYRSVFY
SFEEEHISFALYVDNRFFTLTVTSLHLVFQMGVIFPQ




3 " 


3679 



AGAEKCFVTLLACFLAKQQNKYKYEECKDLIKSMLRNELQFKEE
KIAEQLKQAEELRQYKVLVHSQERELTQLREKLREGRDASRSLN
EHLQALLTPDEPDKSQGQDLQEQLAEGCRLAQHLVQKLSPENDN
DDDEDVQVEVAEKVQKSSSPREMQKAEEKEVPEDSLEECAITCS
NSHGPCDSNQPHKNIKITFEEDEVNSTLWDRESSHDECQDALN
ILPVPGPTSSATNVSMWSAGPLSGEKAAINILEINEKLRPQLA
EKKQQFRNLKEKCFLTQLACFLANQQNKYKYEECKDLIKFMLRN
ERQFKEEKLAEQLKQAEELRQYKVLVHSQERELTQLREKLREGR
DASRSLNEHLQALLTPDEPDKSQGQDLQEQLAEGCRLAQHLVQK
LSPENDNDDDEDVQVEVAEKVQKSSAPREMPKAEEKEVPEDSLE


6351 






ECAITCSNSHGPYDSNQPHRKTKITFEEDKVDSTLIGSSSHVEW
EDAVHIIPENESDDEEEEEKGPVSPRNLQESEEEEVPQESWDEG
YSTLSIPPEMLASYKSYSSTFHSLEEQQVCMAVDIGRHRWDQVK
KEDHEATGPRLSRELLDEKGPEVLQDSLDRCYSTPSGCLELTDS
CQPYRSAPYVLEQQRVGLAVNMDEIEKYQEVEEDQDPSCPRLSR
ELLDEKEPEVLQDSLGRCYSTPSGYLELPDLGQPYSSAVYSLEE
QYLGLALDVDRIKKDQEEEEDQGPPCPRLSRELLEVVEPEVLQD
SLDRCYSTPSSCLEQPDSCQPYGSSFYALEEKHVGFSLDVGEIE
KKGKGKKRRGRRSKKERRRGRKEGEEDQNPPCPRLSRELLDEKG
PEVLQDSLDRCYSTPSGCLELTDSCQPYRSAFYILEQQRVGLAV
DMDEIEKYQEVEEDQDPSCPRLSGELLDEKEPEVLQESLDRCYS
TPSGCLELTDSCQPYRSAFYILEQQRVGLAVDMDEIEKYQEVEE
DQDPSCPRLSRELLDEKEPEVLQDSLGRCYSTPSGYLELPDLGQ
PYSSAVYSLEEQYLGLALDVDRIKKDQEEEEDQGPPCPRLSREL
LEWEPEVLQDSLDRCYSTPSSCLEQPDSCQPYGSSFYALEEKH
VGFSLDVGEIEKKGKGKKRRGRRSKKERRRGRKEGEEDQNPPCP
RLNSMLMEVEEPEVLQDSLDICYSTPSMYFELPDSFQHYRSVFY
SFEEEHISFALYVDNRFFTLTVTSLHLVFQMGVIFPQ


6352 


1291 


319 


KKARRRTEKSQLGRMLVVEVANGRSLVWGAEAVQALRERLGVGG 
RTVGALPRGPRQNSRLGLPLLLMPEEARIiLAEIGAVTLVSAPRP 
DSRHHSLALTSFKRQQEESFQEQSALAAEARETRRQELLEKITE 
GQAAKKQKLEQASGASSSQEAGSSQAAKEDETSDGQASGEQEEA 
GPSSS QAGPS NG VAPLP RS ALI>VQIiATAR PRP VKAR PLD WR VQS 

KDWPHAGRPAHELRYSIYRDLWERGFFLSAAGKFGGDFLVYPGD 

PLRFHAHYIAQCWAPEDTIPLQDLVAAGRLGTSVRKTLLLCSPO 
PDGKWYTSLQWASLQ 


6353 


235 


923 


WSEWLSPOlAAKCKGLSMLRITMKTRAiSLAADATEFVQGRSAP 

AMARSLVHDTVFYCLSVYQVKISPTPQLGAASSAEGHVGQGAPG 

I^GNMNPEGGVNHENGMlSfRDGGMIPEGGGGNQEPRQQPQPPPEE 

PAQAAMEGPQPENMQPRTRRTKFTH,QVEELESVFRHTQYPDVP 

TRRELAENLGVTEDKVRVWFKNKRARCRRHQRELMLANELRADP 
DDCVYIWD 


6354 


65 


672 


Kir'ACiAGAI P EARAR PPt) VQAAEEE KEM UUPUS AS R VFCGR I L5M 
VNTDDVNAI II»AQKNMLDRFEKTNEMLLNFNNLS S ARLQQMS ER 
FLHHTRTLVEMKRDLDSIFRRIRTLKGKLARQHPEAFSHIPEAS 
FLEEEDEDP I PPS TTTT I ATS EQSTGS CDTS PDTVS PS LS PG FE 
DLSHVQPGSPAINGRSQTDDEEMTGE 


6355


965 


510 


^aiiKPMEPTRDCPLFGGA^SAXLPMGAIDVSDLfePVPDNQEVFC - 
HPVTDQSLIVKLLELQAHVRGl2AAARYHFEDVj3GVQGARAVHVE 
S VQPLS LENLALRGRCQEAW VLSGKQQ IAKENQQVAKD VTLHOA 
LLRLPQYQTDLLLTFNQPP 


6356 


158 


16*2 



KWS&AAi-KeSGIiRGAMlKRVLPHGMGRGLLTRRPGTRRGGFSLD 
WDGKVSEI KKKI KS I LPGRS CDLLODTS HLP P EHS D WI VGGG V 
LGLS VAYWIjKKLESRRGAIRVIj WERDHTYS QASTGLS VGG I CO 
QFSLPENIQLSLFSASFLRNINEYLAWDAPPLDLRFNPSGYLL 
LASEKDAAAMESNVKVQRQEGAKVSLMSPDQLRNKFPWINTEGV 
AIiAS YGME DEGWFDP WCL LQGLRRKVQ. SLGVL FCQGEVTR FVS S 

SQRMLTTDDKAVVLKRIHEVHVKMDRSliEYQPVECAIVINAAGA 
^SAQIAAZAGVGEGPPGTLQGTKLPVEPRKRYVYVWHCPQGPGL 
STPLVADTSGAYFRREGLGSNYLGGRSPTEQEEPDPANLEVDHD 
"FQDKVW PHIiALR VP AF ETLK VQ S A W AG Y YD YNT FDQNG WG PH 

PLWNMYFATGFSGHGLQQAPGIGRAVAEMVLKGRFQTIDIiSPF 
jFTRFYLGEKIQENNII 




354 


£33 1 



CGI.TSSCl.Pi.QVMMTKRTKDMGKFSSVTVyTlDEEEEEIEAREv" 

U)SYAQNAKVIEKQLERKGMSKRRI^BI^I^KKAKMKGTLID 
fQFK 


6357 


i 2 


91S 


GIiLRNMALliVRVLRNQTSISQWVPVCSRLlPVSPTQGQGDRALS 
RTSQWPQMSQSQACGGSEQIPGIDIQLNRKYHTTRKLSTTKDSP 
QP VEE KVGAFTKI IEAMGFTOPT_.TrVQff WW T v t jv at bMvrc r« t-ov 

TDFEEFFLRCQMPDTFNSWFLITLLHVWMCLWMKQEGRSGKYM 
CR 1 I VH FM W ED VQQRGRVMGVNP Y I LKKNM I LMTNHF YAAI LG Y 
DEGILSDDHGLAAALWRTFFNRKCEDPRHLELLVEYVRKQIQYL 
DSMNGEDLLLTGEVSWRPLVEKNPQSILKPHSPTYNDEGL 


6358 


2009 


1 1040 


ASDALHSLSAPVLRLSSRSAARPATMTEQAISFAKDFLAGGIAA 
A 1 S KTAVAP I ER VKLLLQ VQHAS KQ X AAD KQ YKG I VD C I VR I P K 

EQGVLSFWRGNLANVIRYFPTQALNFAFKDKYKQIFLGGVDKHT 
OFWRYFAGNTjJlQf2r22k&<'23\T , CT rnrv m nDTtnntnr -n ^ « T r ^ T , 

REFRGLGDCLVKI TKSDGIRGLYQGFSVSVQGI 1 1 YRAAYFGVY 

DTAKGMLPDPKNTHIWSWMIAQTVTAVAGWSYPFDTVRRRMM 
MQSGRKGADIMYTGTVDr*MPTf TPBnpr , r , vair"c»v/ajvrjc!'KTirT r>/-»m^ 

GAFVLVLYDELKKVI 


6359 


98 


1086 


VCRQEEEKMKEDCbPSSHVPISDSKSIQKSELLGLLKTYNCYHE" 
GKSFQLRHREEEGTLIIEGLLNIAWGLRRPIRLQMQDDREQVHL 
x 0 w n r xuijro i_ .f jj J\B ±* £> ir\jNGN X TAQGPS I QP VHKAES S TDSS 
GP LEE AE EA PQLMRT KS DAS CMS QRR PKCRAPGEAQR I RRHRFS 
1NGHFYNHKTS VFTPAYGS VTNVR VNSTMTTLQ VLTLLLNKFRV 
EDGPS E FAL YI VHESGERTKLKDCE YPLISR I LHG P CE KI AR I F 
LMEADLGVEVPHEVAQYIKFEMPVLDSFVEKLKEEEEREIIKLT 
MKFQALRUTMLQRLEQLVEAK 


6360 


1 


345 


GTRGAVPSTIiEE WIjP PRS CR VFW I HSGTTMS KVS FK I TLT SDP 1 

RLPYKVLSVPESTPFTAVLKFAAEEFKVPAATSAIITKDGIGIN 
PAQTAGNVFLKHGSELRI I PRDRVGSC 


6361 


615 


158 


RPGLGQLQHGAIAPQAGNRRCRFHGRLHALTRSTHRGKPMSIMQ 
t- F^-f a jjan a t-ijf iJc>o i v/\vir'Jjvj/\j. J jL AVASTIjSVEHNDGVETGIWAC 

APGRWRRQITSQEFCHFIQGRCTFTPDDGETLHIQAGDALMLPA 
NSTG I WDIQETVRKTYVIiI I* 


6362 


350


1576 


nMiAiSHSAALKLQQLPPTSSSSAVSEASFSYKENLIGALLAIF 
GHLWS I ALNLQK YCH I RIiAGS KDPRA YFKTKTWWLGL FLMLLG 
ELG VFAS YAFAPLSLIVPLSAVS VI ASAI IG 1 1 F I XEKWKPKDF 

lrryvls fvgcglawgtyll vt fapns hekmtgenvtrhl vs w 
pfllymlveiilfclllyfy:<eknaiwiwilllvallgsmtvv 

TVKAVAGMLVLS I QGNLQLD YP I F YVM F VCMVATAVYQAAFLSQ 

ASQMYDSSLIASVGYIIjSTTT ATTaOZXTTTVT n»Tri?mrT ut^u 

ALGCLI AFLGVFL I TRNRKK? I PFEP Y I SMDAMPGMQNMHDXGM 

TVQPELKASFSYGALENNDNISEIYAPATLPVMQEEHGSRSASG 
VPYRVLEHTKKE 


6363 

6364


21 

21


1201 


RRTRLGSSFPRRRDSSAMESYDVIANQPWIDNGSGVIKAGFAG 
DQIPKYCFPNYVGRPKHVRVMAGALEGDIFIGPKAEEHRGLLS I 
RYPMBHGIVKDWNDMERIWQYVYSKDQIiQTFSEEHPVLLTEAPL 
NPRKKRERAAEVFFETFNVPALFISMQAVLSIiYATGRTTGWLD 
SGDGVTHAVPIYEGFAMPHSIMRIDIAGRDVSRFLRLYLRJCEGY 
DFHSSSEFEI VKAI KERAC YLS INPQKDETLETEKAQY YLPDGS 
TIEIGPSRFRAPELLFRPDLIGEESEGIHEVLVFAIQKSDMDLR 
RTLFSNIVLSGGSTLFKGFGDRIXSEVKKLAPKDVKIRISAPQE 
RLYSTWIGGSILASLDTFKKMWVSKKEYEEDGARSIHRKTF 








1201 



RRTRLGS S F PRRRJDS 3AM B S YDVI ANQ PWI DNGSG VT KAGFAG 
DQIPKYCFPNYVGRPKHVRVMAGALEGDIFIGPKAEEHRGLLSI 
RYPMEHGIVKDWNDMERIWQYVYSKDQLQTFSEEHPVLLTEAPL 
MPRKNRERAAEVFFETFNVPALFISMQAVLSLYATGRTTGWLD 
SGDGVTHAVPI YEGFAMPHS IMRIDIAGRDVSRFLRLYliRKEG Y 
DFHSSSEFE I VKAI KERAC YLS INPQKDETIjETEKAQYYLPDGS 
riEIGPSRFRAPELLFRPDLIGEESEGIHEVLVFAIQKSDMDLR 
RTLFSNIVLSOGSTLPKGPGDRLLSEVKKIAPKDVKIRISAPQE 
RLYSTWIGGS IIiASLDTFKKMWVS KKEYEEDGARSIHRKTP 


6365 


234 


1999 


KHKS RAS CAARAQAFG PS RERE VHS R FRSGLRRIiGESifSGCCTM 
ASMGTLAFDEYGRPFLIIKDQDRKSRLMGLEALKSHIKAAKAVA 
NTMRTS LGPNGLDKMWDKD3DVTVTNDGATI LSMMDVDHQI AK 
LMVELSKSQDDE IGDGTTGVWLAGAIjLEBAEQLLDRG IHP IRI 
ADG YEQAARVAIEHLDKI SDSVLVDI KDTEPL I QTAKTTLGS KV 
VNSCHRQMAEIAVNAVLTVADMERRDVDFELIKVEGKVGGRLED 1 
TKLIKGVIVDKDFSHPQMPKKVEDAKIAILTCPFEPPKPKTKHK 
LD VTS VED Y KALQKY F KEKFEEM I QQ I KETGANLA ICQWG FDDE 
ANHLLLQNNLPAVRWVGGPEIEliIAIATGGRIVPRFSELTAEKL 
GFAGLVQE I S FGTTKDKMLVIEQCKNSRAVTI FI RGGNKMI I EE 
AKRS LHDALCVIRNLIRDNRWYGGGAAEISCALAVSQEADKCP 
TLEQ YAMRA FADALEVI PMALSENSGMNPIQTMTEVRARQVKEM 
NPALG IDCLHKGTNDMKQQHVIETLIGKKQQ I SIiATQMVRM ILK 
IDD1RKPGESEE 


6366 


257 


1898 


GNKEGAHSSTFW VLLS 1 FLGAVAMLCKEQGl *TVTjGLNAVFDILV 
IGKFNVLEIVQKVLHKDICSLEMLGMLRNGGLLFRMTLLTSGGAG 
MLYVRWRIMGTGPPAFTEVDNPAS FADSMLVRAVNYNYY YSLNA 
WLLLCPWWLC FDWSMGCI PL IKS I SDWRVI ALAALWFCLIGL I C 
QALCS EDGHKRR I LTLGLG FLVI P FLPASNLFFRVGFWAERVL 
YLPSVGYCVLLTFGFGALSKHTKKKKLIAAWIiGILFINTLRCV 
LRSGEWRSEEQLFRSALSVCPLNAKVHYNIGKNLADKGNQTAAI 
RYYREAVRLNPKYVHAMNNLGNILKERNELaEAEELLSLAVQIQ 
PDFAAAWMNLG I VQNSLKRFEAAEQS YRTAI KHRRKYPDC Y YNL 
GRLYADLNRH VD ALNAWRNATVLKPEHS LAWNNM 1 1 LL DNTGNL 
AQAEAVGREALEL I PNDHSLMFSIANVLGKSQKYKESEALFLKA 
I KANPNAAS YHGNLAVL YHRWGHLDLAKKHYEI SLQLDPTASGT 
KE N YGLLRRKLELMQ KKAV 


6367


287 


1934 


S IGFP VMLVLS I LLYTCEMFQDSVAFEDVAVS FTQEEWALLDPS 
QKNLYRDVMQETFXNLTSVGKTWKVQNIEDEYKNPRRNLSLMRE 
KLCES KESHHCGESFNQIADDMLNRKTLPGI TPCESSVCGEVGT 
GHSSLNTHI RADTGH KSS E YQE YGENP YRNKECKKAFS YLD S FQ 
SHDKACTKE KP YDGKECTETF I S HSCI QRHRVMHS GDGP YKCKF 
CX5KAFYFLNLCLIHERIHTGVKPYKCKQCGKAFTRSTTLPVHER 
THTGVNADECKE CGNAFSFPS E IRRHKRSHTGEKP YBCKQCGKV 
F I SFS S I Q YHKMTHTGEKPYE CKQCGKAFRCGSHLQKHGRTHTG 
EKPYECRQCGKAFRCTSDLQRHEKTKTEDKPYGCKQCGKGFRCA 
SQLQIHERTHSGEKPHECKECGKVFKYFSSLRIHBRTHTGEKPH 
ECKQCGKAFRYFSSLHIHERTHTGDKPYECKVCG KAFTCSSS I R 
YHERTHTGEKPYECKHCGKAFISNYIRYHERTHTGEKPYQCKQC 
GKAF I RAS S CREHERTHT INR 




6368 


1 


327 


RPVPAKLN PR S WPRTAGALPLRP P PLTMAVFHDE VE I EDFQYDE " 
D S 3TYF Y PCP CX3DNFS I TKEDLENORnVATTP r <5 t. T t VT7Tvr« v 
DQFVCGETVPAPSANKELVKC 




6369 


1 


1745 


AGCCRDTRFPTPRGPGSLCHNFCRSAACT\n.*RTIHGSPREDTGT 
PRSREMMFQDSVAFEDVAVSFTQEEWALLDPSQKNLYRDVMQET 
FKNLTSVGKTWKVQNIEDEYKNPRRNLSLMREKLCESKESHHCG 
E S FNQ I ADDM LNRKTL PGI T P CES S VCGEVGTGHSS LNTHI RAD 
TGHKSSEYQEYGENPYRNKECKKAFSYLDSFQSHDKACTKEKPY 
DGKE CTETF I S HS C I QRHRVMHSGDGP YKCKFCGKAF YFLNLCL 
IHERIHTGVKPYKCKQCGKAFTRSTTLPVHERrHTGVNADECKE 
CGNAFS FPSE IRRHKRSHTGEKP YE CKQCGKVF I S FS S IQYHKM 
THTGEKP YE CKQCGKAFROGSHLQKHGRTHTGEKPYE CRQ CGKA 
FRCTSDLQRHEKTHTEDKPYGCKQCGKGFRCASQLQIHERTHSG 
EKPHECKECGKVFKYFSSLRIHERTHTGEKPHECKQCGKAFRYF 
S S LH X HEKTHTGD KP YE CKVCGKAFTCS SSI RYHljftTHTG EKP Y 
cv-ivriv_^jv«.r AoWiXKxHBRTH J\»£»KPYQCKQCGKAFIRASSCRE 
HERTHTINR 




6370 


1711 


329 


F VLS EQR bKTE RTW PRS PGLGRGAAAAGARTAG AGLLRLL LG CG 

AL VGGLRPVTMTTP AMAOfja CVTUCT Or Vut unTn^ok vuwtm*. 

v wwj-i*v*- v a ru. a JrAN/vurirloAl W£iij£>JjiEIjHRTPQEAIMDGTE 
I AVS PRSLHS ELMCP ICLDKLKNTMTTKECLHRFCSDCI VTALR 
S GNKE CPTCR KfCD VS KRSLR PPPNFDAL IS K I YPSR EE YEAHQD 
RV1» I R LS RLHNQQ ALS SSI EEGLRMQAMHRAQR VRR P I PGS DQT 
TTMSGGEGEPGEGEGDGEDVSSDSAPDSAPGPAPKRPRGGGAGG 
SSVGTGGGGTGGVGGGAGSEDSGDRGGTLGGGTLGPPSPPGAPS 
P PE PGGE I ELVFRPHPLLVEKGEYCQTR YVKTTGNATVDHLSKY 
LALRIALERRQQQEAGEPGGPGGGASDTGGPDGCGGEGGGAGGG 
DGPEEPALPSLEGVSEKQYTIYIAPGGGAFTTLNGSLTIjELVNE 
KFWKVSRPLELCYAPTKDPK 




6371 


3 


288 


GVANMSTAMNFGTKSFQPRPPDKGSFPLDKLGECKSFKEKFMKC 
LHNNNFENALCRKESKEYLECRMERKIjMLQBPLEKLGFGDLTSG 




6372


2141 


625 


RVSAIASEGKAEERYKKLEDLLEKSFSLVKMPSLQPVVMCVMKH 

lpkvpekklklvmadkelyracavevrrqiwqdnqalfgdevsp 
llkqyil.ekes alfs telisvlhnffs pspktrrqgewqrltrm 
vgktnrklydmvlqflrtlflirtrnvhyctlraellmslhdldvg 

E I CTVDPCHKFTWCLDACIRERFVDS KRARELQG FLDG VKKGQE 
QVLGDLSMILCDPFAINTIiALSTVRHLQBLVGQETLPRDSPDLL 
LLIiRLLALGQGAWDK I DS QVFKEP KMEVEL I TR FL PMLMS FLVD 
DYTFNVDQKLPAEEKAPVSYPNTLPESFTKFLQEQRMACEVGLY 
YVLHITKQRNKNALLRLLPGLVETFGDLAFGD I FLHLLTGNLAL 
LADEFALED FCS S LFDGF FLTAS PRKENVHRHALRLLIHLHPR V 

APSKLEALQKALEPTGQSGEAVKELYSQLGEKLEQLDHRKPSPA 
QAAETPALELPLPSVPAPAPL 




6373 


67 


711 


PSRAARAS PARLPAMVSWI ISRLWLI FGTLYPAYYSYKAVKSK 
DIKEYVKWMMYWIIFALFTTAETFTDIFLCMFPFYYELKIAFVA 
WLLS P YTKGSSLL YRKFVHPTLSSKEKE IDDCLVQAKDRS YDAL 
VHFGKRGL^AATAAVMAASKGQGALSERLRSFSMQDLTTIRGD 


6374 
6375


S35 


2105 


HKLFCSYIST3EFPSSTRHHSCPTHTFCNYTSSTIFLSSTRDHS - 
CPTHT FCNYTS S T I FLSSTRDHSCPTHTSCNYTS ST I FLS S TRD 
HSCPTHTSCNYTSSTIFLSSTRDHSCPTHTFCNYPRPIIRLSSC 
CPAEI,QTEGSNGXKEVLSGFQWLEDTVLFPEGGGQPnDRGTlN 
DI S VLR VTRRG EQADHFTQTPLDPGSQVLVRVD WERR FDHMQQH 
SGQHL I TAVADH LFKXiKTTS WE LGRFRS AI ELDT PSMTAEQVAA 
I EQS VNE K I RDRLP VNVRELSLDDPE VEQ VSGRGL PDDHAG P I R 
VVN I EGVDSNMCCGTHV S NLS DLQ VI KI LGTE KGK KNRTNL I FL 
SGNR VLKWMERSHG TE KALTALLKCGAEDHVEAVKK J iQNST KI L 
QKNNLN LLRDLA VH I AHS LRNS PD WGG Wl LHRKEGDS EFMN 1 1 
ANEIGSEETLLFLTVGDEKGGGLFLLAGPPASVETLGPRVAEVL 
EGKGAGKKGRFQGKATKMS RRMEAQALLQD YI STOSAK3 




1 


1535 


aimaaatrpvrlpeagcegrercwnpsrsrshsgegglaawsrt 
cpgr prrpgqqvvrg ptmlvtaylafvgllasclglels rcrak 
ppgracsnpsflrfqldfyqvyflaiiaadwlqapyiiyklyqhyy 
flegqiailyvcglastvlfglvassi»vdwlgrkkscvi,fslty 

SLCCLTKLSQDYFVLLVGRALGGLSTALLFSAFEAWYIHEHVER 
HDFPAEWrPATFARAAFWNHVUVWAGVAAEAVASWIGLGPVAP 
FVAAIPLUUAGALALRNWGENYDRQRAFSRTCAGGLRCLLSDR 
R\^LLGTIQALFESVIFIFVFLWTPVIiDPHGAPLGIIFSSFMAA 
SLIiGSSLYR IATSKRYHIiQPMHLLSLAVLI WFSLFMLTFSTS P 
3QESPVESFIAFLLIELACGLYFPSMSFLRRKVIPBTEQAGVLN 
WFRVPLHS LACLGLLVLHD S DRKTGTRNMFS I CSAVM VMALLA V 
VGLFTWRHDAELRVPSPTEEPYAPEL 


6377 


380 


1437


ISSTDIDHYRPSFLVNSKMPSKESWSGRKTNRAAVHKSKQEGRQ ' 
QDLL I AALGMKLGS P KSS VT I WQPLKLFAYS QLTS LVRRATLKE 

NEQIPKYEKIHNFKVHTFRGPHWCBYCANFMWGLIAQGVKCADC 
GLNVHKQCSKMVPNDCKPDLKH\nCKVYSCDLTTLVKAHTTKRPM 
WDMCIREIESRGLNSEGLYRVSGPSDLIEDVKMAFDRDGEKAD 
ISVNMYEDXNIITGALKLYFRDLPIPLITYDAYPKFIESAKIMD 
PDEQLETLHEALKLLPPAHCETLRYLMAHLKRVTLHEKENLMNA 
ENLGIVFGPTLMRSPELDAMAALNDIRYQRLWELLIKNEDIIiF 


6378 


2311 


184£ 


SRIRRRSSRRPREPPGPSRRRRRRRPDPRTMPSEKTFKQRRTFE 
QRVEDVRLIREQHPTKI PVI lERYKGEKQLPVLDKTKFLVPDHV 
NMSEL I KI I R RR LQLNANQ AFFLL VNGHS M VS VSTP I SEVYESE 
KDEDGFLYMVYASQETFGMKLSV 




j 606 


191 


GAGPWEAFPDGIGRRSRRARLPQYKRPPGRVGGGDSGRRNMAVA"~ 
DLAL I PD VD I DS DG VFKY VLI R VHSAPRSGAPAAE S KE I VRG YK 
WABYHADIYDXVSGDMQKQGCDCECLGGGRISHQSQDKKIHVYG 
YSMAYGPAQHAISTEK I KAKYPD YEVTWANDGY 


6379 


35 


378 


BRAG S PS PS RAAIiR RCAPQRSQAPRWPDRAACRRS FQGSQGRAY 
liFNS WNVG CG PAEER VIjIiTGLHAVAD I YCENCKTTLG W KYEHA 
FESSQKYKEGKYIIELAHMIKDNGWD 


6380 
6381 


1414 


462 


PAVQGQRGAGPRrGRGSGNMARFALTVVRHGETRFNKEKIiQGQ* - 
G VDE PIjS ETG FKQAAAAG I FXNNVKFTHAFS SDLMRTKQTMHGI 
LERS KFCKDMT VK YDSRLRERKYG WEGKAL S EL RAMAKAAREE 
CP VFTP PGGETLDQVKMRG I D FFE FLCQL I LKEADQKEQ FSOGS 
PSNCLETS1AEIFPIX3KNHSSKVNSDSGIPGLAASVLVVSHGAY 
MRSIiFDYFLTDLKCSLPATLSRSELMSVTPOTGMSLFIINFEEG 
R EVKP TVQ C I CMNLQDHLNGLTENS LGLNLPS KSNHFE PLKG VP 
LALFTSLLC 


6382 


1668 


218 


AWRAQGSRGFSGAGWRPRQAAAMNFSEVFKLSSLLCKFSPDGK 
YLAS C VQ YRL WRD VNTLQ I LQLYTC LDQIQH I EMS ADS LF I LC 
AM YKRGLVQVWSLEQPEWHCKIDEGS AGLVAS CWSPDGRHI LNT 
TEFHLRITVWSLCTKSVSYIKYPKACLQGITFTRDGRYMA1AER 
RDCKDYVS I FVCSDWQLLRHFDTDTQDLTGIEWAPNGCVLAVWD 
TCLB YKI UjYSLDGRLLST YSAYE WS LGI KS VAWS PSSQFItAVG 
SYDGKVR1LNHVTWKMITEFGHPAAINDPKIWYKEAEKSPQLG 
hGCLS FP PPRAGAG PL PSSES KYE IAS VP VSLQTLKP VTDRANP 

KIG IGMLAFS PDS YFLATRNDNIPNAVWVWDIQKLRLFAVLEQL 
SPVRAFQWDPQQPRLAICTGGSRLYLWSPAGCMSVQVPGBGDFA 

vlslcwhlsgdsmallskdhfclcfleteawgtacrqlgght 


6383 


2 


1062


^fcUJEDRNLCLIAYPLKGDHGIVDIVDNSDCEPKSKLLRWTTMK " 
KHHVLETEKTPKDWVRQHRKEEKMKSHKLEEEFEWLKKSEVLYY 

tvekkgnissqliotynpwsmkchqqqlqrmkenakhrnqykfil 
lenltsryevpcvldlkmgtrqhgddaseekaanqirkcqqsts 
avigvrvcgmqvyqagsgqlmfmnkyhgrklsvqgfkealfqff 
hngrylrrellgpvlkkltelkavlerqesyrfysssllviydg 
kerpewldsdaedledlseesadesagayaykpigassvdvrm 
idfahttcrlygedtwhegqdagyifglqslidivteiseesg 

5 




3159 


1061 



spapgkpsphgsqpaaraaaapampsakqrgskgghgaaspsek"" 

3 AH PS AAR PLAAPT PAAPACRS PS PGGAPAS FPGRAPRS LAS Q P 
^ARAAAAPAMPSAKQRGSKGGHGAASPSEKGAHPSGGADDVAKK 
?PPAPQQPPPPPAPHPQQHPQQHPQNQAHGKGGHRGGGGGGGKS 
JSSSSASAAAAAAAASSSASCSRRLGRALNFLFYLALVAAAAFS 
SWCVHHVLEEVQQVRRSHQDFSRQREEIjGQGLOGVEQKVQSLQa 

'fgtfesilrssqhkqdltekavkqgesevsrisevlqklqnbi 






WO 01/53312 



PCT/US00/34263 



LKDLSDGIHWKDARERDFTSLENTVEERLTSLTKS INDNI AI P ' 
TEVQKRS QKE IN DM KA K V AS IiE ESEGNKQDL KALKEAVKE I QTS 
AKS RE WDMEALRS TLQTMES D I YTEVREL VS LKQEQQAFKEAAD 
TE R LALQAIjTEKL»LRS 2ESVSRLPEEI RRLEE ELRQLKS DS HG P 

kedggfrhseafealqqksqgldsrlqhvedgvlsmqvasarqt 
esleslls ksqeheqrlaax.qgrlegijgss eadqdgiiastvrsl 
getqlvlygdveelkrsvgelpstveslqkvqeqvhtllsqdqa 
qaarlppqdfldrlssldnlkasvsqveadlkmlrtavdslvay 
svkietnennlesakgllddlrndldrlfvkvekihekv 


6384 


738 


1904 


I W E VP VC LTH bLHLQQANQ PI*P P PS S S I NEEDADE ANRAI GE KR 
AAPDSGKKPKTPKTKQQKDPNEPQKPVSAYALFFRDTQAAIKGQ 
NPNATFGEVSQIVASMWDSLGEEQKQVYKRKTEAAKKEYLKALA 
AYRASLVSKAAAESAEAQTIRSVQQTLASTNLTSSLLLNTPLSQ 
HGTVSASPQTLQQSLPRSIAPKPLTMRLPMNQIVTSVTIAANMP 
SNIGAPL1SSMGTTMVGSAPSTQVSPSVQTQQHQMQLQQQQQQQ 
QQQMQQMQQQQLQQHQMHQQIQQQMQQQHFQHHMQQHLQQQQQH 
LQQQINQQQLQQQLQQRLQLQQLQHMQHQSQPSPRQHSPVASQI 
TS P I PAI GS PQPASQQHQSQIQ S QTQTQVLSQ VS I F 


6385 


2 


1584 


PR VRAAD VAAGAQAWS AGMAKS NGENG PRAPAAGESIiSGTRES 
LAQGPDAATTDELS SLGSDSEANGFAERRIDKFG FI VGS QGAEG 
ALEE VPLEVLRQRESKWLDWILNNWDKWMAKKHKKIRLRCQKCil P 
PSLRGRAWQYLSGGKVKLQQNPGKFDEIiDMSPGDPKWLDVIERD 
LHRQFPFHEMFVSRGGHGQQDIiFRVLKAYTLYRPEEGYCQAQAP 
IAAVIiLMHMPAEQAFWCLVQICEKYLPGYYSEKLEAIQLDGEIL 
FSLLQKVSPVAHKHLSRQKIDPLLYMTEWFMCAFSRTLPWSSVL 
RVWDMFFCEGVKIIFRVGLVLLKHALGSPEKVKACQGQYETIBR 
LRS LS P K I MQEAFLVQE WEIiPVTERQI EREHL I QLRRWQET RG 
ELQCRSPPRLHGAKAILDAEPGPRPALQPSPSIRIiPLDAPLPGS 
KAK P KP PKQAQKEQRKQMKGRGQLEKP P APNQAM WAAAGD ACP 
PQHVP P KDS AP KDS AP QDIAPQVS AHHRSQESI/TS QES EDT YL 


6386 


819 


195 


TVCGS F YLG IMQRASRLKRELHMLATEP^PG J! TCWQDKDQMDDL ' 
RAQILGGANTPYEKGVFKLEVIIPERYPFEPPQIRFLTPIYHPN 
I DS AGR I CLDVLKLPPKGAWR PSLN I ATVLTS I QLLMS E PNPDD 
PLMAD I S SE FKYN KPAFLKNARQ WTE KHARQKQKADEEEMIiDNL 
PEAGDSRVHNSTQKRKASQLVGI BKKFHPDV 


6387 


1 


662 


PGPTHASADAWADAWAQPKMAMHNKAAPPQI PDTRREXAEJL-VKR 
KQEIiAETLANIiERQ I YAFEGS YLEDTQM YGNI IRGWDRYLTNQK 
NSNSKNDRRNRKFKEAERLFSKSSVTSAAAVSALAGVQDQLIEK 
REPGSGTESDTSPDFHNQENEPSQEDPEDLDGSVQGVKPQKAAS 
S TS SGS HHSSH KKRKNKNRHS PSGMFD YD FE I DLKLNKKPRAD Y 


6388 


1 


662 


PGPTHASADAWADAWAQPNMAMHNKAAPPQI PDTRRELABLVKR 
KQELAETLANLBRQIYAFEGSYLEDTQMYGNIIRGWDRYIiTNQK 
NSNSKNDRRNRKFKEAERLFSKSSVTSAAAVSALAGVQDQLIEK 
REPGSGTESDTS PDFHNQENEPSQED P EDLDGS VQGVKPQKAAS 
S TS SG S HHS SH KKRKNKNRHS PSGM FD YD FE IDLKLNK KPRADY 


6389 


1074 


497 


AEPGDRMAGHRLVLVLGDLHIPHRCNSLPAKFKKLLVPGKIQHI 
LCTGNLCTKESYDYLKTtiAGDVHIVRGDFDENLNYPEQKVVTVG 
QFKIGL IMGHQVI PWGDMASLALLQRQFDVDILISGHTHKFBAF 
EHENKFYINPGSATGAYNALETNIIPSFVLMDIQASTWTYVYQ 
IiIGDDVKVERIEYKKP 


6390 


158 


535 


GEE R KEGRAPG KAFAPERNPAKME KEETTR ELLL PN WQGSGSHG 
LTIAQRDDGVFVQEVTQNSPAARTGWKEGDQIVGATIYFDNLQ 
SGEVTQLLNTMGHHTVGtiKLHRKGDRFFPSLGQTWDP 


6391 


5386 


2897 


VRWNSKTECYIiSIQTQENFPANLNELVNCIVISSLVT^RKIJ<A 
MSLLGSRNQLARAVLNPNPMDFCTKDLLTTTSERI lAYLRDFNE 
DQKKAI ETAYAM VKHS PS VAK I CL I HGP PGTGKS KT I VGLLYRI* 


6392 






IiTBNQRKGHSDENSNAKIKQNRVLVCAPSNAAVDELMKKI I LEF 1 
KEKCKDKKNPLGNCGDINLVRLGPEKSINSEVLKFSLDSQVNHR 
MK KELPS HVQAMHKRKEFLDYQ LDEIjS RQRALCRGGRE IQRQEL 
D EN I S KVS KERQE LAS KI KEVQG R PQKTQS III LESH I 1 CCTLS 
TSGGLLLESAFRGQGGVPFSCVIVDEAGQSCEIETLTPLIHRCN 
KLILVGDPKQLPPTVISMKAQEYGYDQSMMARFCRLLEENVEHN 
M I SRL P I LQLTVQ YRMHPDI CL FPSN YV YNRN LKTNRQTEAI R C 
SSDWPFQPYLVFDVGDGSERRDNDSYINVQEIKLVMEIIKLIKD 
KR KD VS FRN IG I ITH Y KAQKTM I Q KDLDKE FDRKGPAE VDTVD A 
FQGRQKDCVIVTCVRANSIQGS IGFLASLQRLNVTITRAKYSLF 
I LGHLRTLMENQHWNQL IQDAQ KRGAI I KTCDKNYRHDAVKI L K 
LKPVLQRSLTHPPTIAPEGSRPQGGLPSSKLDSGFAKTSVAASL 
YHTPSDSKEITLTVTSKDPERPPVHDQLQDPRLLKRMGIEVKGG 
I FLWD PQPS S PQHPGATPPTGEPGFPWHQDLSHVQQ PAAWAA 
LSSHKPPVRGEPPAASPEASTCQSKCDDPEEELCHRREARAFSE 
GEQEKCGSETHHTRRNSRTOKRTLEQEDSSSKKRKLL 1 




972 


186


grtgvdi^ssmahrlqirlltwdvkdtllrlrhplgeayat'kSrH 

AHGLEVEPSALEQGFRQAYRAQSHSFPNYGLSHGLTSRQWWLDV 
VLQTFH LAG VQDAQAVAP I AEQL YKDFSHPCTWQVLDGAE DTLR 
ECRTRGLRLA VI SNFDRRLEGILGGLGLREHFDF VLTS EAAGWP 
KPDPRIFQEALRLAHMEPWAAHVGDNYLCDYQGPRAVGMHSFL 
WGPQALDPWRDSVPKEHILPSLAHLLPALDCLEGSTPGT, 1 


6393 
6394 


2017 


730 


TGGSKMAAVATCGS VAASTGSAVATASKSNVTS FORRGPRASVT 
NDSGPRLVS I AGTR P S VRNGQLLVS TGLPALD QLLGGGLAVGTV 
LLIEEDKYNIYSPLLFKYFLAEGIVNGHTLLVASAKEDPANILQ 
ELPAPLLDDKCKKEFDEDVYNHKTPESNIKMKIAWRYQLLPKME 
IG P VS SSRFGH YYDAS KRMPQEL I E ASNWHG F FL PEK I S STLKV 

EPCSLTPGYTKLLQFIQNIIYEEGFDGSNPQKKQRNILRIGIQN 
LGS PLWGDD I CCAENGGNSHSLTKFLYVLRGLLRTSLS ACI ITM 
PTHLIQNKAI IARVTTLSDWVGLESFIGSERETNPLYKDYHGL 
IHIRQIPRI.NNLICDESDVKDLAFKDKRKLFTIERLHLPPDLSD 
TVS RS S KMDLAESAKRLG PGCGMMAGGKKHLDF 


6395 


1418


511 


^/WU^KUAKKRPAAMAT VMAATAAERAVLEEE FRWLLHDEVHAI 
VL KQ LQD I LKEAS LRFTL PGSGTEGPAKQENFI LGS CGTDQ VKG 
VLTLQGDALSQADVNLKMPR1WQLLHFAFREDKQWKLQQIQDAR 
NH VS QAI YL LTS RDQS YQ FKTGAEVLKLMDAVMLQLTRARNRLT 
TPATLTLPEIAASGLTRMFAPALPSDLLVNVYINLNKLCLTVYQ 
LHALQPNSTKNFRPAGGAVLHSPGAMFEWGSQRLEVSHVHKVEC 
VIPWLNDALVYFTVSLQLCQQLKDKISVFSSYWSYRPF


6396


13 


658 


PSGRPTRPLCCAARKGAARHGGSVSGWPAGRTPTErSNPGSSVM " 
ESVTFEDVAVEFIQEWALLDSARRSLCKYRMLDQCRTLASRGTP 
PCKPS CVSQLGQRAEPKATERGILRATGVAWESQLKPEELPSMQ 
lHjIjH h, ASS RDMQMG PGLFLRMQLVPS IEERETPLTREDRPALQE 
PPWSLGCTGLKAAMQIQRVVIPVPTLGHRNPWVARDSGE


6397


1 


1221 



anilsspskkgqkgtligyspegtplynfmgdafqhssqsiprfH 
ikeslkqileesdsrqifyflclnllftfvelfygvltnslgli 
sdgfhmlfdcsalvmglfaalmsrwkatrifsygygrieilsgf 
inglfli vi affvfmesvarli dppeldthmltpvs vggli vnl 

IGICAFSHAHSHAHGASQGSCHSSDHSHSHHMHGHSDHGHGHSH 
3S AGGGMNANMRG VFLHVLADTLGS IG VI VSTVL I EQFG WFIAD 
PLCSLFIAILIFLSWPLIKDACQVLLLRLPPEYEKELKIALEK 
I QKIEGLISYRDPHFWRHSAS I VAGT I H I Q VTSD VLEQR I VQQV 
TG I LKDAG VNNLT I Q VE KEAYFQHMSGLS TGFHDVLAMT KOME S 
1KYCKDGTYIM 




391 


122 C 


iAGUVGRFEAlRAPARMIEWCNDRIXSKKVRVKCNTDDTIGDLlH 
CLIAAQTGTRWNKIVLKKWYTIFKDHVSLGDYEIHDGMNLELYY 


6398



353 



1306 



6399 



75 



1245 






6400 



2520



1053



6401



109



766



6402



1196 



279 



HKQMGPLIWRCKKILIiPTTVPPATMRI WLLGGLLPPLLIiLSGLQ " 
RPTEGSEVAIKIDPDFAPGSFDDQYQGCSKQVMEKLTQGDYFTK 
OI BAQKNYPRMWQKAHLAMIiNQGKVLPQNMTTTHAVAILFYTLN 
SNVHSDPTRAMAS VARTPQQ YERS FHFKYLH Y YLTSAIQLLRKD 
SIMEMGTLCYEVHYRTKDVHFNAYTGATIRFGQFLSTSLLKEEA 
QEFGNQTLFTIFTCLGAPVQYFSLKKEVLIPPYBLFKVINMSYH 
PRGDWLQLRSTGNLST YNCQLLKASS KKCIPDP IAIASLS FLTS 
VIIFSKSRV 

PNLETYFGKKCEKDSMNFTPTHTPV CRKRTWSKRGVAVSGPTK 
RRGMADS LES T P LPS PEDRLAKLHPS KE LLE Y YQ KKMAECE AEN 
EDLLKKL EL YKE ACEGQHKIjECDLQQREEE IAELQKALS DMQ VC 
LFQER EH VLRL YS ENDRLR I RELEDKKK I QNLLALVGTDAGE VT 
YFCKEPPHKVTILQKTIQAVGECEQSESSAFKADPKISKRRPSR 
ERKESSEHYQRDIQTLILQVEALQAQLGEQTKLSREQIEGLIED 
RR IHLEE I QVQHQRNQNKI KE LTKNLHHTQELLYES TKDFLQLR 
SENQNKEKSWMLEKDNLMSKIKQYRVQCKKKEDKIGKVLPVMHE 
SHHAQSEYI KVMS LCRNE WYFSGKVEO I PKNLQFVM 
K.TMKCPE W y E VQ5 A I LRHN CG YAMKT G KFFHNLMER KDFET WL 
DNISVTFLSLTDLQKNETLDHLISLSGAVQLRHLSNNLETLLKR 
DFLKLLPLELSFYLLKWLDPQTLLTCCLVSKQWNKVISACTEVW 
QTACKNLGMQIDDSVQDALHWKKVYLKAILRMKQLEDHEAFETS 
S L IGHS ARVYAL Y YKDGLL CTGS DDLS AKLWD VS TGQC VYG I QT 
HTCAAVKFDEQKLVTGSFDNTVACWEWSSGARTQHFRGHTGAVF 
SVDYNDELDILVSGSADFTVKVWALSAGTCLNTLTGHTEWVTKV 
VLQKCKVXSLLHS PGDYILLS ADKYE I KI WPIGREI NCKCLKTL 
S VSEDRS I CLQPRLHFDGKY1 VCSSALGLYQWDFAS YD ILRVIK 
TP E I ANLALLG FGD I FALLFDNR YLY I MDLRTESL I S RWPLPE Y 

RKSKRGSS FLAGEAS WLNGLDGHNDTGLVFATSMPDHS IHLVLW 
KEHG 

PGAAWSRPDLRGCCTGPQPALRMLVLP ^PCPQPLAFSSVETMEG 
PPRRTCRSPEPGPSSSrGSPQASSPPRPNHYLLIDTQGVPYTVL 
VDEESQR EPGASGAPGQKKC YS CPVCSRVFE YMS YLQRHS ITHS 
EVKPFECD I CGKAFKRASHLARHHS IHLAGGGRPHGCPLCPRRF 
RDAGELAQHSRVHSGERPFQCPHC PRRFMEQNTLOKHTRWKHP 

TT.QnryinTPnQOB t mr* rn/m^T * ~«.,- w 



6403 



6404 



10li 



1690 



"22T 



TTSQCGGIRgsSAI PVASMEFAAI CL-R NALLLLPEBQQDPKQBN 
GAKNSNQLGGNTESSESSETCSSKSHDGDKFIPAPPSSPLRKQE 
LENLKCS I LACSAYVALALGDNLMALNHADKLLQQPKLSGSLKF 
LGHL YAAEALI S LDR I S DA ITHLNPENVTD VS LG ISSNEQDQGS 
DKGF^EAMESSGKRAPQCYPSSVNSARTVMLFNLGSAYCLRSEY 
DKARKCLHQAASMXHPKE\^PPEAILIAVYLELQNGNTQLALQI I 
KRNQLLPAVKTHSEVRKKPVFQPVHPIQPI QMPAFTTVORK 
RGIHTSVL^NLQNQMYSHNWIl^I ^NLNLTQVQQRNLITNLQ ' 
RS VDDTSQA I QR IKNDFQNLQQVFLQAKKDTDWLKE KVQS LQTL 
AANNSALAKANNDTLEDMNSQLNS FTGQMENITTISQANEQNLK 
DLQDLHKDAENRTAIKFNQLEERFQLFETDIVNIISNISYTAHH 
LRTLTSNLNEVRTTCTDTLTKHTDDLTSLNNTLANIRLDSVSLR 
MQQDLMRSRLDTEVANLSVIMEEMKLVDSKHGQLIKNFTILQGP 
PGPRGPRGDRGSQGPPGPTGNKGQKGEKGEPGPPGPAGERGPIG 
PAGPPGERGGKGSKGSQGPKGSRGSPGKPGPQGPSGDPGPPGPP 
GKEGLPGPQGPPGFQGLQGTVGEPGVPGPRGLPGLPGVPGMPGP 
KG P PG P PG PS GAWPLALQNE PTPAP EDNS CPP HWKNFTDKCY Y 

FSVEKEIFEDAKLFCEDKSSHLVFINTREEQQWIKKQMVGRESH 
WI GLT DS ERENE WKWLDGTS PDY KNWKAGQPDN WGHGHGPGED C 
AGLIYAGQWNDFQC EDVNNFICEKDRETVLSgAT. 
aa&r.aMnann p/^Y . 



AAALAMAAPAfGI t lsVFSSSQELOAAIAQl,VAQRAJvcCIAGARA~ 
RFALQLSGGSIiVSMl^ARELPAAVAPAGPASIiARWTIiGFCDERLV - 
PFDHAESTYGLYRTHLLSRLPIPESQVITINPELPVEEAAEDYA 
KKLRQAFQGDS I PVFDLLILGVGPDGHTCSLFPDHPLLQEREKI 
VAPISDS PKP PPQRVTLTLPVLNAARTVI FVATGBGKAAVLKRI 
LEDQEENPLPAALVQPHTGKIiCWFUDEAAARLLTVPFEKHSPli 


6405 


1 


1456 


AALPRPTPRAPLGREGTGSDSEMAASMFYGRLVAVATLRNHRPR™" 
TAQRAAAQVLGSSGLFNNHGLQVQQQQQRNLSLHEYMSMEIjLQE 
AGVS VP KG YVAKS PDEAYAIAKKLGS KD W I KAQVLAGGRG KG T 
FESGLKGGVKIVFSPEEAKAVSSQMIGKKLFTKQTGEKGRICNQ 
VL VCER K Y PRR E Y Y FAI TMERS FQGP VL IGS S HGG VN I E D VAAE 
TPEAI IKEPIDIEEGI KKEQALQLAQKMGFPPNIVESAAENMVK 
LYSLFLKYDATMIEINPMVEDSDGAVLCMDAKINFDSNSAYRQK 
KIFDLQDWTQEDERDKDAAKANLNYIGLDGNIGCLVNGAGLAMA 
TMD 1 1 KLHGGTPANFl,DVGGGATVHQVTEAFXLITSDKKVtiAI L 
VNI FGG IMRCDVIAQG I VMAVKDLEI KI PVWRLQGTRVDDAKA 
LI ADSGLKI IiACDDLDEAARM WKLSE I VTLAKQAH VDVKFQLP 


6406 


1036 


1*7 


HPRQMRGEDTPEAPPYSSGRYDSIKTEVSGCPEDLTVGRAPTAD 
DDDDDHDDHEDNDKMNDSEGMDPERIiKAFNMFVRJbFVDENLDRM 
VPISKQPKEKIQAIIESCSRQFPEFQERARKRIRTYLKSCRRMK 
KNGMEMTRPTPPHDTSAMAENILAAACESETRKAAKRMRIiEIYQ 
S S QDE P I ALDKQHS RDSAAI THS T YS LPAS S YSQD PVYANGGLN 
YS YRG YG ALS SNLQ PPAS LQTGNHSNGESG EARAIiAS R PA PSWV 
CRAALGSGMGRGKQRPVMERGCLTA 


6407 


492 


150 


VGL ClAVSyr V JjAUJjDALIiVFPGO VAQL6 CTLS PQHVT I RD YG V 

SWYQQRAGSAPRYLLYYRSEEDHHRPADIPDRFSAAKDEAHNAC 
VLTISPVQPEDDADYYCSVGYGFSP 


6408 


1458


903 


RGCITSSQAWRDFGGVTRGFNMRIEKCYFCSGPIYPGHGMMFVR 
NDCKVFRFCKSKCHKNFKKKRKPRKVRWTKAFRKAAGKELTVDN 
S FE FE KRRUE P I K YQR ELWKTKT I DAMKR VEE I KQ KRQAK F IMNR 

LKKNKELQKVQDIKEVKQNIHLIRAPLAGKGKQLEEWVIVQQLQE 
DVDMEDAP 


6409 


1*0 


446 


NTAIiANLLRCFTCDR LCGGCTAPAP PAHQG I VLQ P VM PS CDPGP 

GPACLPTKTFRSYLPRCHRTYSCVHCRAHUVKHDELISKSFQGS 
HGRAYLFNSV 


6410 

6411


85 


607 


RGGTAGCVACLGCWGQS S S P KAAF PAGSACLPADSCP CIiLFQAC 
AISGLFNCITIHPLNIAAGVWMIMNAFIIjLLCEAPFCGQFIEFA 
NTVAEKVDRLRS WQKAVF YCGMAWP 1 VI SLTLTTLIX3NAT AFA 
TGVLYGLSALGKKGDAISYARIQQQRQQADEEKLAETLEGEL 




302 


772 


RLS IMASSLNEDPEGSRITYVKGDLFACPKTDSIiAHCISEDCRM ~ 
GAG IAVLFKKKFGGVQELLNQQKKSGEVAVIiKRDGRYI YYLITK 
KRASHKPTYENLQKSLEAMKSHCLKNGVTDLSMPRIGCGLDRLQ 

wenvsamieevfeatdikitvytl 


6412 


61 


1709 



RPVTS FS PLPGS CGGRU3TRTMLGRSLRE VSAALKQGQ ITPTBL 
CQKCL SLI KKTKFLNAY I TVS EEVALKOAEES EKR YKNTttn c; t yir\ 
LDGIPIAVKDNFSTSGIETTCASNMLKGYIPPYNATVVQKIiLDQ 
GALLMGKTNLDEFAMGSGSTDGVFGPVKNPWSYSKQYREKRKQN 
PHSENEDSDWL I TGGSSGGSAAAVSAFTC YAALGSDTGGSTRNP 
AAHCGLVGFKPSYGLVSRHGLIPLVNSMDVPGILTRCVDDAAIV 
LGALAGPDPRDSTTVHE P INKPFMLPS LADVS ICLCIGI PKE YLV 
PELSSEVQSLWSKAADLFES EGAKVIEVSLPHTS YS I VCYHVLC 
TSEVASNMARFDGLQYGHRCDIDVSTEAMYAATRREGFNDWRG 
RILSGNFFLLKENYE^FVKAQKVRRLIANDFVNAFNSGVDVLL 
rPTTLS EAVP YLE F I KEDNRTRS AQDD I FTQAVNMAGLPAVS I P 

/ALSNQGLPIGIjQFIGRAFCDQQLLTVAKWFEKQVQFPVIQLQE 
uMDDCS AVLENE KLAS VS LKQ 


2 


885 


HEPRCAGMAASLWMGDLEPYMDBNFISRAFATMGETVMSVKIIR - 
NRLIG I PAGYCFVEFADLATAEKCLHKINGKPLPGATPAKRFKL 
NYATYGKQPDNSPEYSLFVGDLTPDVDDGMLYEFFVKVYPSCRG 
G K WLDQTGVS KG YGF VKFTDEL EQKRALTECQ GAVGLG S KPVR 
LSVAIPKASRVKPVEYSQMYSYSYNQYYQQYQNYYAQWGYDQNT 
GSYSYSYPQYGYTQSTMQTYEEVGDDALEDPMPQLDVTEANKEF 
MEQ S EELYDALMDCH WQPLDTVS S EI PAWM 


6414 


1 


538 


RGGRAALLPWRRFPCCRPKPQPARPSSRATPGPRSPGMATSIGV 
S FS VGDG VPE AE KNAGE PENT Y I LRP VFQQR FRPS WKD CI HAV 
LKEELANAE YS PEEMPQLTKHLSENI KDKLKEMGFDRYKMWQV 

VIGEQRGEGVPMASRCFWDADTDNYTHDVFMNDSLFCWAAFGC 
FYY 


6415


2 


1168 


FVRQWQSSHRRACGLGCEARAGGGEEPRGRASSVAGWVGAFRAP ' 

FIEAAVAGLGAGSGKRRRGWKMPVHSRGDKKETNHHDEMEVDYA 

ENEGSSSEDEDTESSSVSEDGDSS2MDDEDCERRRMECLDEMSN 

LEKQFTDLKDQLYKERLSQVDAKLQEVIAGKAPEYLEPIoATLQE 

NMQ I RTKVAG I YRELCIjESVKNKYECE I QASRQHCESEKLLLYD 

TVQSELEEKIRRLEEDRHSIDITSEIiWNDELQSRKKRKDPFWPD 

KKKPGWSGPYIVYMLQDLDILEDWTTIRKAMATLGPHRVKTEP 

P VKLEKHLHSARS EEGRL YYDGEW Y I RGQT I C I DKKDECPTS AV 

ITTINHDEVWFKRPDGSKSKLYISQLQKGKYSIKHS 


6416 

6417


410 


1519 


EI APADLE I PACAP VLLS RATS S TMS VTGG KMAPSLTQE I LS HL 
GLASKTAAWGTLGTLRTFLNFSVDKDAQRLLRAITGQGVDRSAI 
VDVLTNRSREQRQLISRNFQERTQQDLMKSLQAAIiSGNLERIVM 
ALLQPTAQFDAQELRTALKASDSAVDVAIEILATRTPPQLQECL 
AVYKHNFQ VE AVDG I TS ETSG I LQDLLLALAKGGRDS YS GI ID Y 
NLAEQDVQALQRAEGPSREETWVPVFTQRNPEHLIRVFDQYQRS 
TGQELEEAVQNRFHGDAQVALLGLASVIKNTPLYFADKLHQALQ 
ETEPNYQVLIRILISRCETDLLSIRAEFRKKFGKSLYSSLQDAV 
KGDCQSALLALCRAEDM 




1 


845 


RGESRVLWbELEGEAGGAGGWASSLNARMDNRFATAFVIACVLS 
L I S T I YMAAS IGTDF W YE YRS P VQENS SDLNKS I WDEF I SDEAD 
EKTYNDALFRYNGTVGLWRRCITIPKNMHWYSPPERTBSFDWT 
KCVS FTI/TEQFMEKFVDPGNHNS GI DLLRT Y L WRCQ FLL P FVSL 
GLMCFGALIGLCACICRSLYPTIATGILHLLAGIjCrrijGSVSCYV 1 
AG I ELLHQKLELPDNVSGE FGWS FCLAC VSAPLQFMASALFI wa 
AHTNR KE YTLMKAYR VA 




6418 


2 


662 


TRTR PRRPPCSijG AAVGKAGARSTS T PAGASP AAA YQ ADPP P~ PAH 
TPAP PP PP P CGG I ACHGE PAKFYG YDNLQRQP I FTTQQEAELVQ 
YPDCKSSSGNIGEDPDHLNQSSSPSQMFPWMRPQAAPGRRRGRQ 
TYSRFQrLELEKEFLFNPYLTRKRRIEVSHALALTERQVKIWFQ 
NRRMKWKKENNKDKFPVSRQEVKDGETKKEAQELEEDRAEGLTN 




6419 


1 


973 


PGRPRVRNFDLNSKS ILQEFFCTRS IQ I PANRSKTAMSKCP I FP 
MARSISTSGPLDKEDTGRQKLISTGSliPATLQGATDSUSLEWHL 

PS PDPVTVPYLS PLWWKELE Q TT ,T?T0T?nriUB t T»T7 a n c*t /r-ktrtr r» rr> 

FWNLVWYFRRLDLPSNLPGLILSSBHCNKYSKIPRHCMSEDSKY 
VL I QMLWDNMKIiHQDPGQPLY I LWNAHTQKYPMVHLLQKSDNS F 
NQELLKSMVKSIKMNDVYGPMSQILETLNKCPHFKRQRSLYREI 
LFL5 Ii VALGREN I D I DAFDKE YKMAYDRLTPSQVKSTHNCDRP P 
STGVMECRKTFGEPYL 




6420 


207 


1187 



RKM I DKNQTCGVGQDS VP YM I CLIH I LEEWFGVEQLED YLNFAN 
YLLWVFTPLILLIIiPYFTIFLLYLTIIFLHIYKRKNVLKEAYSH 
ML WDGARKTVATIiWDGHAAVWHG YE VHGMEKI P EDGPAL 1 1 F YH 
SAI P I DFY YFMAKI FIHKGRTCRWADHFV FK I PG FS LLLD VFC 
MiHGPREKCVEILRSGHLIiAISPGGVREALISDETYNIVWGHRR 
3FAQVAI DAKVP I 1 PMFTQNIREGFRSLGGTRLFRWLYEKFRYP 
FAPMYGGF^^RTYI^DPIPYDPQITAEELAEKTKNAVQALID 
KHQRI PGNIMSALLERFH 


6421 
6422 


1 ± Of* ft 


362 


WALSLRRQPKi^SNKLLSPHPHSVVLRSBFKMASSPAVI^ASRL 
YQWSLKS SAQFLGS PQLRQVGQI I RVPARMAATli I L E PAGRCCW 
DEPVR I AVRGI*AP EQ P VTLRAS LRDE KGALFQAHAR YRADTLGE 
LDLERAPALGGSFAGLEPMGLLWALEPEKPLVRLVKRDVRTPIA 
VELEVLDGHDPDPGRLLCQTRHERYFLPPGVRREPVRVGRVRGT 
LFLPPEPGPFPGIVDMFGTGGGLLEYRASLLAGKGFAVMALAYY 
NyEDLPKTMBTLHLEYFEEAMNYLLSHPEVKGPGVGIiLGISKGG 
ELCI>SMAS FLKG 1 TAAWINGS VANVGGTLRYKGETLPPVG VNR 
NRIKVTKDGYADIVDVLNSPLEGPDQKSFIPVERAESTFLFLVG 
QDDHNWKSEFYANEACKRL<3AHGRRKPQ I ICYPETGHYI EPPYF 
PLCRASLHALVGS P 1 1 WGGEPRAHAMAQVDAWKQLQTFFHKHLG 
GREGTIPSKV 


6423 


181 


2133 


EGENLSWFQEFWGDIAKEFYWKTPCPGPFI^YNFDVT*KGKtFIE~ 
WMKGATTNI CYNVI*DRNVHEKKLGDKVAFYWEGNEPGETTQ1TY 
HQLL.VQ VCQ FSNVLRKQG I HKGDRVAI YMPM I PEL WAMLACAR 
IGALHSIVFAGFSSESLCERILDSSCSLLITTDAFYRGEKIiVNI, 
KELADEALQKCQEKGFPVRCC3VVKHLGRABLGMGDSTSQSPPI 
KRSCPDVQI SWNQGIDIiWWHELMQEAGDECE PE WCDAEDPLF I L 
YTSGSTGKPKGWHTVGGYMLYVATTFKYVFDFHAEDVFWCTAD 
I G W I TGHS YVTYGP LANGATS VLFEG I PT Y P DVNR L WS I VDKYK 

VTKFYTAPTAIRLLMKFGDEPVTKHSRASLQVLGTVGEPINPEA 
WLWYHRWGAQRCPIVDTFWQTETGGHMLTPLPGATPMKPGSAT 
FPFFGVAPAILNESGEELEGEAEGYLVFKQPWPGIMRTVYGNHE 
RFETT YFKK FPG Y YVTGDG CQRDQDG YYW I TGR I DDMLNVS GHL 
LS TAEVESAIiVEHEAVAE AAVVGHPHPVKGECLYC FVTIjCDGHT 
FSPKLTEELKKQIREKIGPIATPDYIQNAPGLPKTRSGKIMRRV 
LRKIAQNDHDLGDMSTVADPSVISHLFSHRCLTIQ 


6424 


614 


1237 


ANIrKE I PRDirPPETVI/L YIiDSNQ ITS XPNE I FKDJj^QLR VLNLS 
KKGI E FIDEKAFKGVAETLQTLDLSDNRIQSVHKNAFNNLKARA 
RIANNPWHCDCTLQQVLRSMASNHETAHNVICKTSVLDEHAGRP 
FtiNAANDADLCNLPKKTT D YAMLVTM FG W FTMVI S YWY YVRQN 
QEDARRHLEYLKSLPSRQKKADEPDDISTW 


6425


1 


1188 


KKVSWPVAA^rVHCSC^LFRKYGNFIDKLRLFTRGGSGGMG1^PR^r^ 
GGEGGKGGDVWWAHNRMTLKQLKDRYPRKRFVAGVGANSKISA 
LKGS KGKDWE I PVPVG IS VTDENGKI IGELNKENDRILVAQGGL 
GGKLLTNFLPLKGQKR I IHLDLKLI ADVGLVGFPNAGKSS LLSC 
VSHAKPAlADYAFTrLKPELGKIMYSDFKQISVADLPGLIEGAH 
MNKGMGHKFLKHIERTRQLLFWDISGFQLSSHTQYRTAFETII 
LLTKELELYKEEIiQTKPALliAVNKMDLPDAQDKFHELMSQLQNP 
KD FLHLtFEKNM I PERT VE FQHI I P I S AVTGEG I E ELKNC I R KS L 

DEO^NQENDALHKKQLLNLWISDTMSSTEPPSKHAVTTSKMDrT 


6426


1850


1144 


IiAMEGGGGIPLETLKEESQSRHVLPASFEVNSLQKSMgT^ 
LVGGTLVAVYAVATPFVTPALRKVCLPFVPATMKQIENWKMLR 

crrgslvdigsgdgriviaaakkgftavgyelnpwlvwysryra 

WREGVHGSAKFYISDLWKVTFSQYSNWIFGVPQMMLQLEKKLE 

RELEDDARVIACRFPPPHWTPDHVTGEGIDTVWAYDASTFRGRE 
KRPCTSMHFQLPIQA 


6427 


30 


565 



SRGAAVGGMS VAGGS I RGDTGGE DTAA PGRFS FS PEPTLED I RR — 
LHAEFAAERDWBQFHQPRNLLLALVGEVGELAELFQWKTDGEPG 
PQGWSPRERAALQBELSDVLIYLVALAARCRVDLPLAVLSKMDI 
JRRRYPAHLARSSSRKYTELPHGAISEDQAVGPADIPCDSTGQT 




145 


959


^AS WQPPHVPKAGKMVSWM I CRLVVLVFGMLCPAYAS YKAVKTK " 
riREYVRWMMYWiyFALFMAAEIVTDlFISWFPFYYEIKMAFVL 


6428






WLLSPYTKGASLLYRKFVHPSLSRHEKEIDAYIVQAKERSYETV 
LSFGKRGLNIAASAAVQAATKSQGALAGRLRSFSMQDLRSISDA 
PAPAYHDPLYLEDQVSHRRPP1GYRAGGLQDSDTEDECWSDTEA 
VPRAPARPREKPLIRSQSLRWKRKPPVRBGTSRSLKVRTRKKT 
VPSDVDS 




1982 


444 


SGSGGKMEDHQHVPIDlQTSKLLDWLiVDRRHCSLKWQSLVLTlR" 
E K I NTAAI QDMP ESEE I AQLLSGS Y I HYFHCLR I LDLLKGTEAST 
KNIFGRYSSQRMKDWQEIIALYEKDNTYLVEIiSSLLVRNVNYEI 
PSLKKQXAKCQQLQQEYSRKEEECQAGAAEMREQFYHSCKQYGI 
TGENVRGELIALVKDLPSQLAEIGAAAQQSLGEAIDVYQASVGF 
VCESPTEQVLPMLRFVQKRGNSTVYEWRTGTEPSWERPHLEEI, 
PEQ VAEDAI D WGDFG VEAVS EGTDS G I S AEAAG IDWGIFPESDS 
KDPGGDGIDWGDDAVALQITVLEAGTQAPEGVARGPDALTLLEY 
TETRNQFLDELMELEaFLAQRAVELSEEADVLSVSQFQIiAPAIL 
QGQTKEKMVT^5VSVLEDLIGKLTSLQLQHLFM1LASPRYVDRVT 
EFLQQ KLiKQS QLLALKKE LM VQKQQE ALE EQAALE P KLD LLLE K 
TKELQKLI EADIS KRYSGRPVNLMGTSL 


6429 


3413 


3442 


E P S S WTAAP RG PLAAHPLE AAVQED DRRALS FD S R I KVFANGTL 
WKS VTDKDAGDYLCVARNKVGDD YWLKVD WMKPAKI EHKEE 
NDHK VFYGGDLKVDCVATGLPNPEI S WSLP0GSLVNSFMQSDDS 
GGRTKRYWFNNGTLYFNEVGMREEGDYTCFAENQVGKDEMRVR 
VKWTAPATI RNKTCLAVQ VP YGD WT VACE AKG E PMPKVTWLS 
PTNKVI PTSSEKYQI YQDGTLLIQKAQRSDSGNYTCLVRNSAGE ' 
DRKTVWIHVNVQPPKINGNPNPITTVREIAAGGSRKLIDCKAEG 
IPTPRVLWAFPEGVVLPAPYYGNRITVHGNGSLDIRSLRKSDSV 
Q L VCMARNEGGEARLI VQLT VLEPM E KP I FHD P I S EKITAMAGH 
TISLNCSAAGTPTPSLVWVLPNGTDLQSGQQLQRFYHKADGMLH 
I SGL S S VDAG A YRCVARNAAGHTERIiVS LKVGLKP EANKQ YHNL 
VSIINGETLKLPCTPPGAGQGRFSWTLPNGMHLEGPQTLGRVSL 
LDNGTLTVREAS VFDRGTYVCRMETEYGPSVTS I PVIVIAYPPR 
rTSEPTPVIYTRPGNTVKbWCMAMGlPKADITWELPDKSHLKAG 
VQARL YGNRFLHFQGS LT I QHATQRDAGF YKCMAKNI LGSDS KT 
TYIHVF 


6430 


1946 


602 


RTRVS TGLRRTLLWSEAVGAS STRGDTGI PGSGEGGAGPGGGEG~ 

AMLEAMAEPSPEDPPPTLKPETQPPEKRRRTIEDFNKFCSFVLA 

YAGYlPPSKEESDWPASGSSSPIiRGESAADSDGWDSAPSDLRTI 

QTFVKKAKSSKRRAAQAGPTQPGPPRSTFSRLQAPDSATLLEKM 

KLKDSLFDLDGPKVASPLSPTSIiTHTSRPPAAIiTPVPLSQGDLS 

HPPRKKDRiCNRiCLGPGAGAGFGVLRRPRPTPGDGEKRSRIKKSK 

KRKLKKAERGDRLPPPGPPQAPPSDTDSBEEEEEEEEEEEEEMA 

T WGG EAP VP VLPT P PE APRPPATVHPEGVP PADSES KE VGSTE 

TSQDGDASSSEGEMRVMDEDIMVESGDDSWDIiITCYCRKPFAGR 
PMI ECS LCGTW IHL 3 CAK I KKTNVPDFF YCQKCKE LRPEARRLG 
GPPKSGEP 


6431 


3 


605 


WWNSSYNLPAYAPYLPCELA.CAMnnc^Ryn'haynntfMP&T'^A^T/i-^ — 

LEEEAIiRRKERLKAIiREKTGRKDKEDGEPKTKHLREEEEEGEKH 
RELRLRNYVPEDEDLKKRRVPQAKPVAVEEKVKEQLEAAKPBPV 
IEEVDIiANLAPRKPDWDLKRDVAKKLEKLKKRTQRAlAELIRER 
LKGQEDSLiASAVDAATEQKTCDSD 


6432 


56 


1692 



GGLGTMGSRIKQNPETTFEVYVEVAYPRTGGTLSDPEVQRQFPE 
D YSDQE VLQTLTKFCFP F YVDS LTVS Q VGQNFTFVLTD I DS KQR 
FGFCRLSSGAXSCFCILSYLPWFEVFYKLLNILADYTTKRQENQ 
WNELLETLHKLPI PDPGVSVHLS VHS YFTVPDTRELPS I PENRW 
LTEYFVAVDVNNMLHLYASMLYERRILIICSKLSTLTACIHGSA 
&MLYPM YWQHVYI PVIiPPHLLDYCCAPMPYLIGIHI»SIjMEKVRN 
^AliDDWlLNVDTNTLETPFDDLQSLPNDVISSlJCNRljKKVaTT 


6433






TGDGVARAFIiKAQAAFFGSYRNALKIEFEBPITFCEEAFVSHYR 
SGAMROFLONATOIjOLFlfOPTnf3PT m t Mcnp^cpmrnT^nnv.Tw 

GBYAGSDKLYHQWLSTVRKGSGAIIiNTVKTKAWPAMKTVYKFDI 
AENGCAPTPEEQLPKTAPS PLVEAKDPKLREDRRPITVHFGQVR 
PPR PH WKR PKS NI AVEGRRTS VP S PEQNTI AT PATLH I LQKS I 
TKFAAKFPTRGWTSSSH 




1S24 


484 


AP VT KR KEVFAKDS KGS ALDAGRDPKRPALPE TLCESG WASNTA 
PTT P PQ PG WCL CG KD FKS S CQTP G RE KERRLATMHGS CS FLMLL 
LPLLLLLVATTGPVGALTDEEKRLMVELHNLYRAQVSPTASDML 
HKRWDEELAAFAXA YAR QC VWGHNKE RGRRG ENLFA 1 TDEGMDV 
PLAME E WHHEREH YNLSAATCS PGQMCGH YTQ WWAKTERI G CG 
SHFCEKLQGVEETNriELLVCNYEPPGNVKGKRPYQEGTPCSQCP 
SGYHCKNSLCEPIGSPEDAQDLPYLVTEAPSFRATEASDSRKMG 
AEGPDKPS WSGLNSG PGHVWGPLLGLLLL P P L VLAGI F 


6434 
6435 


40 


2002 


MPQLNFGMADPTQMGGLSMJLjijtiAGEHALGTPKVFSGTCRPDVSE 1 
SPELRQKSPLFQFAEISSSTSHSDASTKQCQTSALFQFAEISSN 
° >< ru^tf V KKCGKS/Uj* Q JjAEM CLAS EGM KME ES KL I KAKES 

DGGR I KELEKGKEEKEI KMEKTDETRLQKBAEFBKSAKBNLRDS 
KELRNFE ALQ I DD I MA I KMEDPKE I RKEELEEDHKCS H FPDFS Y 

SASSKIIISDVPSRKDHMCHPHGIMIIEDPAALNKPEKLKKKKK 
KSKMDRHGNDKS TPKKTCKKRQSSESD I ES VI YT I EAVAKGDWG 
IEKLGDTPRKKVRTSSSGKGSILEAKPPKKKVKSREKKMSKEKS 
SDTTKESRPPDFZSISASKNISGETPEGIKAEPLTPMEDAIiPPS 
LSGQAKPEDSDCHRKI ETCGSRKSERS CKGALY KTLVSEGMLTS 
LRANVDRGKRSSGKGNSSDHEGCWNBESWTFSQSGTSGSKKFKK 
TKPKEDOjLGSAKLDEEFEKKFNSLPQYSPVTFDRKCVPVPRKK 
K KTGNVS S E PT KTS KGSGDKW SNKQL FLDAI HPTEA I FSEDRNT 

MEPVHKVKNIPSIFNTPEPTTTARTFGGQPKEKSKENPDYSPCO 
DTORAGYHHEEUT.WMTMT .MMWrvmrvT vr\r n!tm*»im.»- 


6436 


2227


657 


ALQRDAAAAYAHPE YEERFLQEETVSQQ INS IELLQTRPLALPE 
WKSQRPLQRQVHLRGRPASQPTVIRGITYYKAKVSEEENDIEE 
QQDE FFSGDNG VDLLi I EDQLLRHNGLMTS VTRRPAATRQGHS TA 
VTS DLNARTAPWS S ALPQP S TS DP S IANHAS VGP TLQTTS VS PD 
PTR ES VLQPS PQ VPATTVAHTATQQPAAPAPPAVS PREALME AM 
HTVPVPPTTVRTDSLGKDAPAGRGTTPASPTLSPBBEDDIRNVI 
GRCKDTLSTirGPTTQNTYGRNEGAWMKDPLAKDERIYVTNYYY 
GNTLVEFRNLENFKQGRWSNS YKLP YSWIGTGHWYNr a w wmo 
AFTRNI I KYDLKQR Y VAAWAMLHD VA YEEATP WRWQGHS DVD FA 
VDBNGL WL I YPALDDEGFSQE VI VLS KLNAADLSTQ KETT WRTG 
LRRN FYGN CF VI CGVLYA VDS YNQRNAN I S YAFDTHTNTQI VPR 
LLFENE Y F YTTQID YNPKDRLIi YAWDNGHQVT YHVI FAY 


6437 


1295 


341 


GACR PP VRQlJ P DSG P D YEALPAGATVTTHMVAGAVAG I LEHC VM 
YPIDCVKTRMQSLQPDPAARYRNVLBALWRIIRTEGLWRPMRGL 
NVTATGAGPAHAL YEACYEKLKKTLS DVIHPGGNSH IANGAAGC 
VATLLHDAAMNPAE WKQRMQM YNS P YHRVTDCVRAVWQNEGAG 
AFYRSYTTQLTMNVPFQAIHFMTYEFtiQEHFNPQRRYNPSSHVL 
S GACAGAVAAAATTPIiDVCKTIiIiNTQESl i ALNSHI TGH I TGMAS 
AFRTVYQVGG VTA Y FRG VQ ARVI YQ I PSTAI AWS VYEF FKYL I T 
KRQEEWRAGK 




1828 


3^0 . 



PPAPAPPASPARHVTRTARGHLEGGSRAPPLUjAVFLQIXNMVK" 
CIHTLADHGDDVKCCAFSFSIiLATCSIiDKTIRLYSLRDFTELPH 
3 PLKFHTYAVHCCCFS PSGH I LAS CSTDGTTVLWNTENGQMLAV 

^EQPSGSPVRVCQFSPDSTCIJ^GAATCTVVLWNAQSYKLYRCG 
3 VKDGSLAACAFS PNGS FFVTG S S CGDLTVWDD KMRCLES EKAH 
)LGITCCDFSSQPVSDGEQGLQFFRLASCX3QDCQVKIWIVSFTH 
C LG FELK YKS TLSGHCAP VLACAFS HDGQMLVSGS VDKS V I VYD 


" *438 






TNTENILHTLTQHTRYVTTCAFAPNTI*LLATGSMDKT VNI WQFD 
LETLCQARSTEHQLKQFTEDWSEEDVSTWLCAQDLKDLVGIFKM 
KN I DG KELLNLTKES LADDLK I E S LG LRS KVLRKI EELRTKVKS 
LS SG X PDE P I CP I TRELMKD P VI AS DG YS YEKE AMENWD PAKRN 
RTSPP 




109 


901


EVQ I LRAKMFQTGGLI VFYGLIiAQTMAQFGGLPVPLDQTLPLNV 
NPALPLSPTG1AGSLTNALSNGLLSGGLLG1LENLPLLDI LKPG 
GGTSGGLLGGIjIjGKVTS VI PGLNNI IDIKVTDPQLLELGLVQSP 
DGHRLYVTI PLGI KLQVNTPLVGAS LLRLAVKLD I TAE I LAVRD 
KQERI HLVLGDCTHS PGS LQ ISLLDGLGPLP IQGLLDSLTGI LN 
KVLPELVQGNVCPLVNEVLRGLDITLVHDIVNMLIHGLQFVI KV 


6439 
6440 


23 


412


S IQTASAI'XTEMASQSQG I QQLLQAEKRAAEKVADARKRKARRL 
KQAKEEAQMEVEQYRREREHEFQSKQQAAMGSQGNLSAEVEQAT 
RRQVQGMQSSQQRNRERVLAQLLGMVCDVRPQVHPNYRISA 




3 




RARWNSDMGDLPGLVRIiS IALRIQPNDGPVFYKVDGQRPgQNRT 
IKLLTGSSYKVEVKIKPSTLQVENISIGGVLVPLELKSKEPDGD 
RWYTGT YDTEG VTPTKSG ERQP I Q I TM PFTD IGTFETVWQ VK F 
YNYHKRDHCQWGSPFSVIEYECKPNETRSLMWVNKESFIj 


6441 


234 


1373 


KSGGLRRRQRPGRSAAVGEEELPPGMEKFKAAMLLGSVGDALGX 
RNVCKENSTVGMKIQEELQRSGGLDHLVLSPGEWPVSDNTIMHI 
ATAEAliTTDYWCLDDLYREM VRCYVE I VEKLPERRPDPATI EGC 
AQLKPNNYLLAWHTPFNEKGSGFGAATKAMCIGLRYWKPERLET 
L I E VS VE CGRMTHNHPTG F LGSLCTA1*F VS FAAQGKPI/VQWGRD 
M LRAVPLAEE YCR KT I RHTAE YQEHWF YFEAKWQ F YLEERK I S K 
DSENKAI FPDNYDAEEREKT YRKWSSEGRGGRRGHDAPM I AYDA 
LLAAGNSWTELCHRAMFHGGESAATGTIAGCLFGLLYGLDLVPK 
GI>YQDIiEDKEKLEDLGAAI*YRLSTEEK 


6442 


34 


796 


AEDPAGGLAGQDTMFARGLKRKCVGHEEDVEGALAGbKTVSSYS 
LQRQS LLDMS L VKLQLCHML VE PNLCRS VLI ANTVRQ I QE EMTQ 
DGTWRTVAPQAAERAPLDRLVSTEILCRAAWGQEGAHPASGLGD 
GHTQGPVSDLCPVTSAQAPRHLQSSAWEMDGPRENRGSFHKSLD 
Q I FETLETKNPS CMEELFSDVDSP YYDIiDTVLTGMMGGARPGPC 
EGLEGLAPATPGPS S S CKSDIjGELDHWE ILVET 


6443 
6444 


2 


555 


MAS PAAS S VRPPRPKKEPQTLVI PKNAAEEQKLKLERLMKNPDK 

AVP I PE KMS EWAPR P PPE FVRDVMGSS AGAGSGEFHVYRHLRRR 

EYQRQDYMDAMAEKQKLDAEFQKRLEKNKXAAEEQTAKRRKKRQ 

KT.KEKKLLAKKMiaEQKKQEGPGQPKEQGSSSSAEASGTEEEEE 
VPSFTMGR 




390


899 


<3S TPRGKMRAP I PE P KPGDI* I E I FRP FYRHWAI YVGDGYVVHLA 
P PS E VAGAGAAS VMS ALTDKA I VKKELL YDVAGSDK YQVNNKHD 
DKYSPLPCSKIIQRAEELVGQEVLYKLTSENCEHFVNELRYGVA 
RSDQVRDVI IAASVAGMGLAAMSLIGVMFSRNKRQKQ 


6445 
6446 


2 


753 


AGAAGAAGAARS PRPQAHTKGVRGIiPSRRRSPDCGRMEIjAAGS F 
SEEQFWEAGAELQQPALAGADWQLLVETSGISIYRLLDKKTGLY 
EYKVFGVLEDCSPTLLADIYMDSDYRKQWDQYVKELYEQECNGE 
TWYWE VKYP FPM SNRD YVYLRQRRDLDMEGRKIHVI LARS TSM 
PQLGERSGVlRVKQYKQSIiAIESDGKKGS KVFMYYFDNPGGQI P 
SWLINWAAKNGVPNFLKDMARACQNYLKKT 




1 


1651 



KUHTKappPDTPGSRGTTAMCSIASGATGGRGAVENBEiJi^lLS— 

D SGDE AAWE DEDDADL PHGKQQTP CLFCNRLFTS AEBTFS HCKS 

EHQFNIDSMVHKHGLEFYGYIKLINFIRLKNPTVEYMNSIYNPV 

PWEKEEYLKPVLEDDLLLQFDVEDLYEPVSVPFSYPNGLSENTS 

/VEKLKHMEARALSAEAALARAREDLQKMKQFAQDFVMHTDVRT 

:SSSTSVIADLQEDEDGVYFSSYGHYGIHEEMIiKDKIRTESYRD 

?I YQNPH I FKDKWLDVGCGTG I LSMFAAKAGAKKVLGVDQSEI 


6447






LYQAMD1 IRIiNKLEDTITLIKGKIEB VHLPVEKVDVI ISEWMGY 
FLLFESMLDS VL YAKNKYLAKGGS VYPD I CTISLVAVSDVNKHA 
urn AfWUDVYGPKMS CMKKAVI PEA WE VLDPKTL I SEPCGIKH 
IDCHTTSISDLEFSSDFTLKITRTSMCTAIAGYFDIYFEKNCHN 
RWFSTGPQSTKTHWKQTVFLLEKPFSVKAGEALKGKVTVHKNK 
KDPRS LTVTLTLNNS TQT YGLQ 


6448 


1554 


1068


RLGPAEWHL.SGPCHATLGAANRGRALGVRAAWRGAPLCQRWIMP 
S RTNLATG I PSS KVKYSKLS S TDDG YI DLQFKKTPP KI P YKAI A 
LATVLFLIGAFLI I IGSLLLSGYISKGGADRAVPVLI IGILVFL 
PGFYHLRIAY YASKGYRGYS YDDI PDFDD 


6449 


74 


S&9 '- 


GQVLSHCYHYRSSRWRRGGLSRGRGAGVMALVPYEBTTEFGLQK 
FHKPLAT FS FANHT I Q I RQDWRHLGVAAWWDAA I VliS TYLEMG 
AVELRGRS AVELGAGTGLVGI VAALLACR I R YERDNN FLAMLER 
QFIVRKVHYDPEKDVHIYEAQKRNQKEDL ! 


6450 


597 
848"" 


1876


ii Y G V CJENLRKliE ITGVS CRD V YAKLLHR YRH i LGLWQPDIGPYG 
GLLNVWDG LF I IGWM YL PPHD PHVDD PMR FKPLFR I HliMERKA 
ATVECMYGHKGPHHGHIQIVKKDEFSTKCNQTDHHRMSGGRQEE 
FRTW-LREEWGRTLEDI FHEHMQELI LMKFI YTSQYDNCLTYRRI 
YLPPSRPDDLIKPGLFKGTYGSHGLEIVMLSFHGRRARGTKITG * 
DPN I PAGQQTVE I DLRHRI QL PDLENQRNFNEIjS RI VLE VRERV 
RQEQQEGGHEAGEGRGRQGPRESQPSPAQPRAEAPSKGPDGTPG 
EDGGE PGDAVAAAKQ P AQCGQGQ P F VLP VGVS S RNEDYPRTCRM 
CF YGTGL I AG HG FT S PERTPG VF I LFDEDRFG FVWLELKS FSLY 
SRVQATFRNADAPSPQAFDEMLKNIQSLTS 


6451




2*9 


tVPAPRTVSGKRSLPGEWEERGEGEQRTCREFSGNGGRAVEAAR 
MRLLCGLWLWLSLLKVLQAQTPTPLPLPPPMQSFQGNQFQGEWF 
VLGLAGNSFR p EHRAL LNAFTATFELS ddgrfe vwnamtrgqhc 
DTWS YVL I PAAQ PGQFT VDHRVW THEQAGR PQDQPAGOJSL VAAS 
RDAG P VHLPGQSSGPLG 


6452 


232


939 


HbPTPPTSPRASTMEDVKLEFPSLPQCKEDAEEWTYPMRREMQE 

xlpglflgpyssamksklpvlqkhgithiicirqnieanfikpn 
fqqlfrylvldiadnpveniirffpmtkefidgslqmggkvlvh 
gnagisrsaafviayimetfgmkyrdafayvqerrfcinpnagf 

vwyiiQE YEAI YliAKDT I QMMS PLQ I ERS LS VHSGTTGSLKRTHE 
EEDDFGTMQVATAQNG 


6453 


1 


652 


KTKGES SNME PIAA YPLKCS G PRAKVFAVLLS IVLCTVTLFLLQ" 
LKFLKPKINSFYAFEVKDAKGRWSLEKYKGKVSLWNVASDCQ 
LTDRW YLGLKELHKE FG PSH FS VLAFP CNQFGES EPR P SKE VE S 
r^ra^i^v -Irrl FHKIJKIIiGSEGEPAFRFLVDSSKKEPRWNFWK 
YL VNP EGQ WKFWRPEE P I E VI RPDI AALVRQ V 1 1 KKKEJDL 


6454


827 T 
827 -4- 


223 


MKKWLPGLSMSPRJRTLPRPLSLCIiSLCLCLCLAAALGSAQSGSC 
RDKKNCKWFSQQELRKRLTPLQYHVTQEKGTESAFEGEYTHHK 
DPGIYKCWCGTPLFKSETKFDSGSGWPSFHDVINSEAITFTDD 
FSYGMHRVETSCSQCGAHLGHIFDDGPRPTGKRYCINSAALSFT 
PADSSGTAEGGSGVASPAQADKAEL


6455 




223 



HKRWLPGLSMSPRRTLPRPLSLCLSJLCLCLCLAAALGSAQSGSC " 
RDKKNCKWFSQQELRKRLTPLQYHVTQEKGTESAFEGEYTHHK 
DPGIYKCWCGTPLFKSETKFDSGSGWPSFHDVINSEAITFTDD 
FSYGMHRVETSCSQCGAHLGHIFDDGPRPTGKRYCINSAALSFT
PADSSGTAEGGSGVASPAQADKAEL 




1042 


173



^VHLATViiASAAWDALGLPVRSHMQGSTRRMUVMTUVHRRFLQL 
.MTHGVLEEWDVKRLQrHCYKVHDRNATVDKI.EDFINNINSVLE 
> L Y I E I KRGVTEDDGRP I YALVNLATTS IS KMATDFAENELDLF 

tKALELIIDSETGFASSTNILNLVDQLKGKKMRKKEAEQVLQKF 
r QNKWLIEKEGEFTLHGRAILEMEQYIRETYPDAVKlCNICHSIi 



506 



WO 01/53312 



PCT/US00/34263 











LIQGQSCETCGIRMHLPCVAKYFQSNAEPRCPHCNDYV7PHEIPK 
VPD PEKERESGVLKSNKKS LRSRQH 


6456 


2 


555 


RPQSRS I SMWRNSLLQVSSGLRWLRVCAMVD ILGERHLVTCKGA 
TVEAEAALQNKWALYFAAARCAPSRDFTPLLCDFYTALVAEAR 
RPAPFEWFVSADGSSQEMLDFMRELHGAWLALPFHDPYRHELR 
KRYNVTAI PKLVI VKQNGE VTTNKGRKQ I RERGLAC FQD WVEAA 
DIFQNFSV 


6457 


23 


892 


PTTGFPVTNFPWNWPDGKPPIMILYVSKLNKIIHFPDFDKKIPV
KLFPLPLLYVGNHISGLSSTSKLSLPMFTVLRKFTIPLTLLLET
IILGKQYSLNIILSVFAIILGAFIAAGSDLAFNLEGYIFVFLND
IFTAANGVYTKQKMDPKELGKYGVLFYNACFMIIPTLIISVSTG
DLQQATEFNQWKNVVFILQFLLSCFLGFLLMYSTVLCSYYNSAL
TTAWGAIKNVSVAYIGILIGGDYIFSLLNFVGLNICMAGGLRY
SFLTLSSQLKPKPVGEENICLDLKS


6458 


23 


892 


PTTGFPVTNFPWNWPDGKPPIMILYVSKLNKIIHFPDFDKKIPV
KLFPLPLLYVGNHISGLSSTSKLSLPMFTVLRKFTIPLTLLLET
IILGKQYSLNIILSVFAIILGAFIAAGSDLAFNLEGYIFVFLND
IFTAANGVYTKQKMDPKELGKYGVLFYNACFMIIPTLIISVSTG
DLQQATEFNQWKNVVFILQFLLSCFLGFLLMYSTVLCSYYNSAL
TTAWGAIKNVSVAYIGILIGGDYIFSLLNFVGLNICMAGGLRY
SFLTLSSQLKPKPVGEENICLDLKS


6459 


23 


892


PTTGFPVTNFPWNWPDGKPPIMILYVSKLNKIIHFPDFDKKIPV
KLFPLPLLYVGNHISGLSSTSKLSLPMFTVLRKFTIPLTLLLET
IILGKQYSLNIILSVFAIILGAFIAAGSDLAFNLEGYIFVFLND
IFTAANGVYTKQKMDPKELGKYGVLFYNACFMIIPTLIISVSTG
DLQQATEFNQWKNVVFILQFLLSCFLGFLLMYSTVLCSYYNSAL
TTAWGAIKNVSVAYIGILIGGDYIFSLLNFVGLNICMAGGLRY
SFLTLSSQLKPKPVGEENICLDLKS


6460 


23 


892 


PTTGFPVTNFPWNWPDGKPPIMILYVSKLNKIIHFPDFDKKIPV
KLFPLPLLYVGNHISGLSSTSKLSLPMFTVLRKFTIPLTLLLET
IILGKQYSLNIILSVFAIILGAFIAAGSDLAFNLEGYIFVFLND
IFTAANGVYTKQKMDPKELGKYGVLFYNACFMIIPTLIISVSTG
DLQQATEFNQWKNVVFILQFLLSCFLGFLLMYSTVLCSYYNSAL
TTAWGAIKNVSVAYIGILIGGDYIFSLLNFVGLNICMAGGLRY
SFLTLSSQLKPKPVGEENICLDLKS


6461 


1653 


360 


LQQRTLRITAVGQTHPIAW^4AM^PSLGAFYGPASFITFVNCMYF 
LS I F I QLKRHPE RKYELKE P TEEQQRLAANENGE I NHQDSMS LS 
LI STS ALENEHT FHSQLLGAS LTLLL YVALWMFGALAVSLYYPL 
DLVFS F VFGATS LS FS AFF WHHCVNREDVRLAW I MTCCPGRS S 
YS VQVNVQ P PNSNGTNGEAPKCPNS SAE S SCTNKSAS SF KNS S Q 
GCKLTNLQAAAAQ CHANS LPLNSTPQLDNSLTEHSMDND I KMHV 
APLEVQFRTNVHS SRHHKNRSKGHRASRLTVLRE YAYDVPTS VE 
GSVQNGLPKSRLGNNEGHSRSRRAYLAYRERQYNPPQQDSSDAC 
STLPKS SRNFEKP VSTTS KKDALRKPAWELENQQKS YGLNLAI 
QNGP I KSNGQEG PLLGTDS TGNVRTGLWKHETT V 


6462


3 


773 


SEELDREKKLKEDSPRKTPNKESGVPSL PVSLTS I KEEPKEAJCH 
PDSQSMEES KLKNDDRKTPVNWKDSRGTRVAVS SPMSQHQSY I Q 
YLHAYPYPQMYDPSHPAYRAVSPVLMHSYPGAYLSPGFHYPVYG 
KMSGRE ETE KVNTS PS VNTKTTTESKALDLLQQHANQYRSKS PA 
PVEKATAEREREAERERDRHS PFGQRHLHTHHHTHVGMG Y PL I P 
GQ YDPFQGLTSAAL VAS QQ VAAQASAS GM FPGQRRE 


6463
6464 


2 

12 


350 
1154 


VILCILGGWIFKNADRSMEKKKGEPRTRAEARPWVDEDLKDSSD 
LHQAEEDADE WQES E ENVEH I P FSHNH Y PEKE MVKRS QE FYELL 
NKRRS VRFI SNEQVPMEVIDNVIRTAGL 

G I LRQKERE ERNR I H KKE I LFLEHLL W PS EMSS LSGKVQT VLG 








JjVEPSKLGRTLTHEHLAMTFDCCYCPPPPCQEAISKEPIVMKNL 
YWIQKNAYSHKENIiQLNQETEAIKEELLYFKANGGGALVENTTT 
G1SRDTQTLKRLAEETGVHIISGAGFYVDATHSSETRAMSVEQL 
TDVLMNE ILHGADGTSIKCGI IGEIGCSWPLTESERKVLQATAH 
AQAQLGCPVI IHPGRSSRAPFQI IRILQEAGADISKTVMSHLDR 
TILDKKEIiLEFAQLGCYLEYDLFGTELLHYQLGPDIDMPDDNKR 
I RRVRLLVEEG CEDR I LVAHD IHTKTRLMKYGGHG YSH I I/TNW 
PKMLLRG I TENVLDK I L I ENPKQWLTFK 


6465 


126 


1356 


KMTVFFKTLRNHWKKTTAGLCLLTWGGHWLYGIOlcr^LLRRAAcH 
QEAQVFGNQL I PPNAQVXKATVFLNPAACKGKARTLFE KNAAP I 
IJILSGMDVTIVKTDYEGO^JCKLLELMENTDVIIVAGGDGTLQEV 
VTGVLRRTDEATFSKIPIGFIPLGETSSLSHTLFAESGNKVQHI 
TDATLAI VKG ETVPLD VLQI KGEKEQP VFAMTGLR WGS FRDAG V 
KVS KYWYLEPLKI KAAHFFSTLKEWPQTHQAS I SYTGPTERPPN 
EPEETPVQRPSLYRRlLRRIiASYWAQPQDALSQEVSPEVWKDVQ 
LSTIEUSITTRNNOLDPTSKEDFLNICIEPDTISKGDFITIGSR 
KVRNPKLKVEGTECLQASQCTIiLIPEGAGGSFSIDSEEYEAMPV 
EVKLLPRKLQFFCDPRKREQMLTSPTQ | 


6466 


1134 


828 


vargtelsqlekahppadmgrrkskrk^ppkkkmtgtletq'ft^ 

PFCNHEKSCDVKMDRARNTGVISCTVCLEEFQTPITYLSEPVDV 
YSDWIDACEAANQ 


6467 


301 


2571 


GEIJlVLtALAHGELACHAVIjTASLLSLRSRLMDSDMDY ER PNVET 
I KCVWGDNAVGKTRL I CARACNATLTQYQLLATHVPTVWAIDQ 
YRVCQEVLERSRDWDDVSVSLRLWDTFGDHHKDRRFAYGRSDV 
WLCFSIANPNSLHHVKTMWYPEIKHFCPRAPVILVGCQLDLRY 
ADLEAVNRARR PLAR P I KPN"E ILPPE KGREVAKELG I PYYETSV 

VAQFGIKDVFDNAIRAALISRRHLQFMKSHLRNVQRPLLQAPFL 
PPKPPPPIIWPDPPSSSEECPAHLLEDPLCADVILVLQERVRI 
FAHKIYLSTSSSKFYDLFIjMDLSEGELGGPSEPGGTHPEDHQGH 
SDQHHHHHHHHHGR D FIiLRAAS FD VCES VDEAGGS G PAGLRAST 
SDGILRGNGTGYDPGRGRVLSSWSRAFVSIQEEMAEDPLTYKSR 
LMVWKMDSS I QPGPFRA VL K YL YTGEUDENERDLMH XAH I AEL 
LEVFDLRMMVANILNNEAFMNQEITKAFHVRRTNRVKECIiAKGT 
FSDVTFIE/DDGTISAHKPIiLISSCDWMAAMFGGPFVESSTREW 
FPYTSKSCMRAVLEYLYTGMFTSSPDLDDMKLIILANRLCLPHLi 
VALTEQYTVTGLMEATQMMVDIDGDVLVFLEIiAQFHCAYQLADW 
CLHHICTNYNNVCRKFPRDMKAMSPENQEYFEKHRWPPVWYLKE 
EDHYQRARKERE KED YLHLKRQ PKRRWLF WNS P S S PS S S AAS S S 
SP5SSSAW | 


6468
6469 


3 


1374 



uawagtnmaalapvgspasrgpriaaglrllpmlgiXqllaepg 

LGRVHHLALKDDVRHKVHLNTFGFFKDGYMWNVSSLSLNEPED 1 
KD VT I G FSI4DRTKNDGFS S YLDEDVN YCI LKKQS VS VTLL I LD I 
S RS E VR VKS P P EAGTQL P KI I FS RDEKVLGQS QE PNVNPAS AGN 
QTQKTQDGGKS KRSTVDS KAMGE KS FS VHNNGGAVS FQFFFNI S 
TDDQEGL YSL YFHKCjbGKEL PSDKFTFSLDIE ITE KNPDS YLSA 
GEIPLPKLYISMAFFFFLSGTIWIHILRKRRNDVFKIHWLMAAL 
PFTKSLSLVFHAIDYHYISSQGFPIEGWAWYYITHLLKGALL^ 
ITI A L I GTG WAF I KH I LS DKBKKI FM I VI PRR VLANVAYI I IES 
TEEGTTE YGLWKDSLFLVDLLCCGAI LFPWWS I RHLQEASATD 
3KGKFSRAHFVLLSLL 




3 


1374



OAWAGTNMAAXAP VGS PAS RG PRLAAGLRLL PMLGLLQLLAEPG I 

[jGRVHHIiALiaJDVRHKVHIiNTFGFFKDGYMVVNVSSI>SLNEPED 

tDVTIGFSUDRTiOIDGFSSYLDEDVNYClLKKQSVSVTLLILDI 

5RSEVRVKS PPEAGTQLPKI I FSRDEKVLGQSQEPNVNPASAGN 

JTQKTQDGGKSKRSTVDSKAMGEKSFSVHNNGGAVSFQFFFNIS 

TODQEGLYSLYFHKCLGKELPSDKFTFSLDIEITEKNPDSYIiSA 








GEIPLPKLYISMAFFFFLSGTIWIHlLRXRRNDVFKIHWLMAAIi 
PFTKSLSIiVFHAIDYHYISSQGFPIEGWAWYVlTHLLKGALLF 
I TI AL I GTG WAF I KHX L S DKDKK I FM I VI PRRVIiANVAY 1 1 1 BS 
TEEGTTEYGLWKDSL FLVDLLCCGAILFP WWS I RHLQEASATD 
GKGKFSRAHFVLLSLL 


6470 


2726 


1437 


AAASGVSSRADAPVLAQSPASAGNGRPSTPRVPGSRRHPSAPRS 
GPLPREDGCRTPGPQLLPLPGALLRPRTLLSSAAETGRSRHPDT 
QHPSSGGRCRGGTESPSSAAGRPASMAEAEEDCHSDTVRADDDE 
EUES PAETDLQAQLQMFRAQWM felafg vss snlenr pcraarg 
SLQKTSADTKGKQEQAKEEKARELFLKAVEEEQNGALYEAIKFY 
RRAMQLVPDIEFKITYTRSPDGDGVGNSYIEDNDDDSKMADLI^S 
YFQQQLTFQESVLKLCQPBLESSQIHISVLPMEVLMYIFRWWS 
SDLDLRSLEQLSLVCRGFYICARDPEIWRLACLKVWGRSCIKIiV 
P YTS WREM FL ERPR VR FDG VY 1 5 KTTYI RQGEQS LDGF YRAWHQ 
VE YYRY I R F F PDGHVMMLTTPEE PQS I VP RLRTR 


6471 


1750 


299


FFFDKMAAGGSGVGGKRSSKSDADSGFLGLRPTSVDPALRRRRR 
GP RNKKRGWRRLAQEPLGLE VDQ FLED VRLQE RTS GGLLS EAPN 
EKLFFVDTGS KEKGLTKKRTKVQKKSLLLKKPLRVDLI LENTSK 
VPAPKDVLAHQVPNAKKLRRKEQLWEKIAKQGELPREVRRAQAR 
LLNPSATRAKPGpQDTVERPFYDLWASDNPLDRPLiVGQDEFFLE 
QTKKKGVKRPARLHTKPSQAPAVEVAPAGAS YNPS FEDHQTLLS 
AAHEVELQRQKEAEKLERQIALPATEQAATQESTFQELCEGLLE 
ESDGEGEPGQGEGPEAGDAEVCPTPARIATTEKKTEQQRRREKA 
VHRLRVQQAALRAARLRHQELFRLRG I KAQVALRLAELARRQRR 
RQARREAEADK PRRLGRL KYQAPDI D VQLSS ELTDSLRTLKP EG 
NILRDRFKSFQRRNMIEPRERAKFKRKYKVKLVEKRAFREIQL 


6472 


3 


897 


SCGSDRAQWAMEFPFDVDALFPERITVLDQHLRPPARRPGTTTP 
ARVDLQQQIMTI I DELG KAS AKAQNLS AP I TS AS RMQSNRHW Y 
ILKDSSARPAGKGAI I G F I KVGYKKL FVLDDREAHNE VE PLC I L 
DFYIHESVQRHGHGRELFQYMLQKERVEPHQLAIDRPSQKLLKF 
LNKHYNLETTVPQVNNFVI FEGFFAHQHRPPAPSLRATRHSRAA 
AVDPTPAAPARKLPPKRAEGDIKPYSSSDREFLKVAVEPPWPLN 
RAPRRATP PAHP PPRSSSLGNSPERGPLRPFVP 


6473 


22 


912 


SSAVEFVWEGEKMAAEPNXTEIQTLFKRLRAVPTNKACFDCGAK 
NPSWASITYGVFLCIDCSGVHRSLGVHLSFIRSTELDSNWNWFQ 
LR CMQ VGGNANATAFFRQHGCTANDANTKYNS RAAQM YREKI RQ 
Jjoo/VrtJjAKttvjr 1 DLWI DNMSS AVPNHS PE KKDSD F FTEHTQP PAW 
DAPATEPSGTQQPAPSTESSGLAQPEHGPNTDLLGTSPKASLEL 
KSS 1 1 G KKKP AAAKKGLGAKKGLGAQ KVSSQ S FSE I ERQAQ VAE 
KJ^EQQAADAKKQAEESMVASMRLAYQELQIDR 


6474 


3 


462 


LQRQRQHPAAAPAVPVRCFTFCFTDIVIMPKRKSPENTEGKDGS 
KVTKQEPTRRSARLSAKPAPPKPEPKPRKTSAKKEPGAKISRGA 
KGKKEEKQEAGKEGTAPSENGETKAEEIHISRSTVNVSTSRGTP 
PSTLSVKGQIETVRVKGTEN


6475 
6476


3 


462 


LQRQRQHPAAAPAVPVRCFTFCFTDIVIMPKRKSPENTEGKDGS
KVTKQEPTRRSARLSAKPAPPKPEPKPRKTSAKKEPGAKISRGA 
KGKKEEKQEAGKEGTAPSENGETKAEEIHISRSTVNVSTSRGTP 
PSTLSVKGQIETVRVKGTEN 




106 


1090 



ARAMAQYKGTMREAGRAMHLLKKRERQREQMEVLKQRIAEETIL 
KSQVDKRFSAHYDAVEAELKSSTVGLVTLNDMKARQEALVRERE 
RQLAKRQHLEEQRLQQERQREQEQRRERKRKISCLSFALDDLDD 
3ADAAEARRAGNIX3KNPDVDTSFLPDRDREEEENRLREELRQEW 
BAQREKVKI>EEMEVTFSYWDGSGHRRTVRVRKGNTVQQFLKKAL 
3GLRKDFLELRSAGVEQLMFIKEDLILPHYHTFYDFIIARARGK 
5GPLFSFDVHDDVRLLSDATMEKDESHAGKWLRSWYEKNKHIF 
3 AS R WEAYD P EKKWD KYTI R 

6477






227 


915 


lqghlmg imaas rplsrfwewgkni vcvgrnyadhvremrsavl 
sepvlflkpstayapegspilmpaytrnlhhelelgwmgkrcr 
avp e aaamd yvgg yal cldmtardvqde ckkkgl pwtlaks fta 
scpvsafvpkekipdphklklwlkvngeiirqegetssmifsipy 

IIS YVS Kl I TL EEGD 1 1 LTGTP KGVG PVKENDE I EAG 1 HGLVS M 
TFKVEKPEY 


6478 


2 


1495


F VS SRI LPE S LASS EAS TLE AMGRKE EDDCS S WKKQTTNI R KT F 
IFMBVLGSGAFSEVFLVKQRLTGKLFALKCIKKSPAFRDSSLEN 
EIAVLKKIKHENIVTLEDIYESTTHYYLVMQLVSGGELFDRILE 
RGVYTEKDASLVIQQVLSAVKYLHENGIVHRDLKPENLLYLTPE 
ENS K IMITDFGLS KMEQNG IMS TACGTPG YVAPEVLAQKP YS KA 
VDCWSIGVITYILLCGYPPFYEETESKLFEKIKEQYYEFESPFW 
DDISESAKDFICHLLEKDPNERYTCEKALSHPWIDGNTALHRDI 
YPSVSLQIQKNFAKSKWRQAFNAAAWHHWRKLHMNLHSPGVRP 
EVENRPPETQASETSRPSSPEITITEAPVLDHSVALPALTQLPC 
QHGRRPTAPGGRSLNCLVNGS LH I S S S L VFMHQG S LAAG PCGCC 

SSCLNIGSKGKSSYCSEPTLLKKANKKQNFKSEVMVPVKASGSS 
HCRAG QTGVCL I M 


6479 


3 


949


SCRGPGWHPAGGQAGAMELLSALSLGELAIjSFSRVPLPPVFDLS
YFIVSILYLKYEPGAVELSRRIIPIASWbCAMLHCFGSYILABLL 
LGEPLIDYFSNNSSILIASAVWYLIFFCPLDLFYKCVCFLPVKL 
I FVAMKEWRVRKI AVG IHHAHHHYHHGWFVM IATGWVKGSGVA 
LMS NFEQLLRG VWKP ETNE I LHMS FPTKAS LYGAI L FTLQQTRW 
LPVSKASLI FIFTLFMVSCKVFLTATHSHSS PFDALEGYI CPVL 

FGSACGGDHHHDNHGGSHSGGGPGAQHSAMPAKS KEELS EG SRK 
KKAKKAD 


6480 


192 


514


DFMSj.Xi?PIHCPDYLRSAKMTEVMMNTQPMEEIGLSPRKDGLSY
QIFPDPSDFDRCCKLKDRLPSIVVEPTEGEVESGELRWPPEEFL 
VQEDEQDNCEETAKENKEQ 


6481 
"6482 


110 


1131


KSRMDLDWNMFVIAGGTLAIPILAFVASFLLWPSALIRIYYWY
WRRTLGMQVRYVHHEDYQFCYS FRGRPGHKPS I LMLHGFS AHKD 
MWLS WKFIi PKNLHL VC VDMPGHEG TTRS S LDDL S I DGQVKR I H 
QFVECLKLNKKPFHLVGTSMGGQVAGVYAAYYPSDVSSLWLVCP 
AGLQYSTDNQFVQRLKBLQGSAAVEKIPLIPSTPEEMSEMLQLC 
S YVRFKVPQQ ILQGLVD VR I PHNNFYRKLFLE I VSEKSRYSLHQ 

NMDKIKVPTQIIWGKQDQVLDVSGADMLAKSIANCQVELLENCG 
HSWMERPRKTAKLUDFIASVHNTDNNKKLD 


6483 


2517 


568


E P VS KVSQSRRKAG VP TAN I EE S O^VEAAMANVP WAE VCEKFQA
ALALSRVELHKNPEKEPYKSKYSARALLEEVKALLGPAPEDEDE 
RPEAEDGPGAGDHALGIiPAEWEPEGPVAQRAVRLAVIEFHLGV 
NHIDTEEI.SAGEEHLVKCLRLLRRYRLSHDCISLCIQAQNNLGI 
LWSEREEIETAQAYLESSEALYNQYMKEVGSPPLDPTERFLPEE 
E KLTEQERS KRFE KVYTHNL YYIiAQVYQHIi EMFEKAAH YCHS TL 
KRQLEHNAYHPIEWAINAATLSQFYINKLCFMEARHCLSAANVI 
FGQrGKI SATEDTPEAEGE VPEL YHQRKGE I ARCW I KYCLTLMQ 
NAQLSMQDNIGELDLDKQSELRALRKKELDEEES IRKKAVQFGT 
GELCDAISAVEEKVSYLRPLDFEEARELFLLGQHYVFEAKEFFQ 
IDGYVTDHIEWQDHSALFKGLAFFETDMERRCKMHKRRIAMLB 
PLTVDLNPQYYLLVNRQIQFEIAHAYYDMMDLKVAIADRLRDPD 
SHIVKKINNLNKSAIiKYYQLFLDSLRDPNKVFPEHIGEDVLRPA 
MLAKFRVARLYGKIITADPKKELENLATSLEHYKFIVDYCEKHP 
[ EAAQEIEVELBLSKEMVSLLPTKMERFRTKMALT 




3 


623


iNaHltljLXjljRARAPLSANGREARAMEQRLABFRAARKRAGIiAAQP
PAAS QGAQT PGEKAE AAATLKAAPGWLKRFLVWKPRP ASARAQP 
GLVQEAAQPOGSTSETPWNTAIPLPSCWDQSFLTNITFLKVLLW 
L LVLLGLFVBLEFGLAYFVLSIjFYWMYVGTRGPEEKKEGEKSAYS 1 
" VFNPGCEAIQGTLTAEQLERELQIiRPliAGR 


6484 


201 




QLAVKTKMSGLRPGTQVDPEIELFVKAGSDGESIGllCPPCQRLF 
MILWLKGVKFNVTTVDMTRKPEELKDLAPGTNPPFLVYNKELKT 
DFIKIEEFI*EOTLAPPRVPHT.QPIfVTf PCi?nirr'r»NiT pavcc^vtv 

ntqkeanknfe ksllke fkrlddylntpijldeidpdsaebppvs 
rrlfldgdqltladcs ll pklni i kvaakkyrdfd i paefsg vw 
rylhnayareefthtcpedkeientyanvakqks 1 


6485 


6 


1091 


FVDL\^VEFLPCPDSUKLEKBCQSSEESMGSNSMRSILEEDEE'H 
ustct c ri* vuu i niif Ki» * fc, VGKUjvwHKHKKYPFWPAVVKSVRQRDK 
KASVLYIEGHMNPKMKGFTVSLKSLKHFDCKEKQTLLNQAREDF 
NQDIGWCVSIiITDYRVRLGCGSFAGSFLSYYAADISYPVRKSIQ 
QDVLGTKLPQLSKGSPEEPWGCPLGQRQPCRKMLPDRSRAARD 

CVETYIiEDEGQLDLWKYLQGVYQEVGAKVLQRTNGDRIRFILD 
VLLPSAI I CAISAGDEVDYKTAEEKYIKGPSLS YREKE X FDNQL 
LEERNRRRR j 


6486 


10


581 


LVLQAGGAHLSPSRVTQGIVTYMIJVF^EMPKPPDYSELSDSI^^ 
**** iWf *wirxjntc^«xi w u v iiN rKUKMCaWXGVGljYIiliASAAAFYY VFE 1 
ISETYNRLALEHIQQHPEEPLEGTTWrHSLiKAQLIiSLPFWVWTV 
I FLVP YLQMFLFLYSCTRADPKTVG YC 1 1 P 1 CLAVI CNRHQAFV 
KASNQISRLQlilDT 


6487 


352 


863 


SFIjKPLRGKMSVTIjHTDVGDIKIEVFCERTPKTCENFIiAiCA^N 
YYNGCI FHRNI KGFMVQTGDPTGTGRGGNS I WGKKFEDE YSEYL 
KHNVRG WSMANNGPNTNGS QFF I T YGKQPHL DMKYTVFGKV ID 
GLETLDELEKLPVNEKTYRPLNDVHIKDITIHANPFAQ \ 


6488 


878 


241 


TALQEPGTSGPPl^LRFALPSGTGRFKPLFGARGPSWPPSPRVpH 

MEPPWT iVPVPTT . YT/"VriT.C \rr*i addt a -r> x\jtr ht/ai t-»/-«wi» »».<•. 1 
* ^ J_) x r' v rvjj i v IWiJOKVjiaAKRijSPIMI^KQLEGIWHTSIVVH 

KDEF F FGSGG I SSCPPGGTLLGPPDSWDVGSTEVTEEI FLE YL 

SSLGESLFRGEAYNLFEHNCNTFSNEVAQFLTGRKIPSYITDLP 

SEVLSTPFGOALRPLLDS IQIQPPGGSS VGRPNGQS ! 


6489 


1457 


375 


kvakmatalseeeudnedyysixnvrreasseelkaayrrXoc^ 

yhpdkhrdpblksqaerlfnlvhqayevlsdpqtraiydiygkr 

glemegwewerrrtpaeireeferlqrereerrlqqrtnpkgt 

ISVG VDATDLFDR YDEEYEDVSGSS FPQI EINKMHI SQS I EAPL 
TATDTAI LSGSLSTQNGNGGGS INFAIiRR VTSAKGWGELE FGAG 

DLOGPLFGLKLFRNljTPRC , F*VTTMr , ZVT.ni?Ge'D<^TT>eir'r toittt »n f 

NLDKNTVGYI^VraCSSPLLQVQRPHRNTRACAPEPSFliPFLHVP 
TW13AECSGARTPSTAMTSAAVKLREACLSGPGSGSHQLLLLTPR 
SKRRTGGG 


6490 


3 


1183 


HEAGCEVWLGYGPRAAAAAAATVLFGGAGPrETMFVARSIAADH 
KDL I HD VS FDFHGRRMATCS S DQS VKVWDKSESGDWHCTAS WKT 
HSGS VWRVTWAHPE FGQVLAS CSFDRTAAVWEE I VGESNDKLRG 
QSHWVKRTTL VDSRTS VTD VK FAPKHMGLMLATCS ADGI VR I YE 
APDVMNLSQWSLQHEISCKLSCSCISWNPSSSRAKSPMIAVGSD 
DSS PNAMAKVQ I FE YMENTRKYAKAETLMTVTDPVHDIAFAPNL 
GRSFHILAIATKDVRIFTLKPVRKELTSSGGPTKFEIHIVAQFD 
NHNSQVWRVSWNITGTVIASSGDDGCVRLWKANYMDNWKCTGII4 
KGNGSPVNGSSQQX3TSI^SLGSKIPSLQyrStiNGSSAGRKHfi 


6491 


3 


1183 



heagcewlgygpraaaaaaatvlfggagptetmfvarsiaadhH 

KDL I HDVSFDFHGRRMAT CSS DQS VKVWDKSESGDWHCTAS WKT 
HSGSVWRVTWAHPEFGQVLASCSFDRTAAVWEEIVGESNDKLRG 
QSHWVKRTTLVDS RTS VTD VK FAP KHMGLMLAT CS ADG I VR I YE 

APDVMNLSQWSLQHErSCKLSCSCISWNPSSSRAHSPMlAVGSD 
DS S PNAMAKVQ I FE YNENTRKYAKAETLMTVTDP VHD IAFAPNL 
3RSFIIIIiAIATKDVRIFTLKPVRKELTSSGGPTKFEIHIVAQFD 
^SQVWRVSWNITGTVIASSGDIXSC^LWKANYMDNWXCT^ 

KGNGS P VNGS SQQGTSN PS LGSNI P S LQNS LNG S S AGRKHS 1 


6492 


34 


2573 


IPFLKSCCCCCIiFDFPPPPLDQVQEEECE^ERVTEHGTPKPPRK 

PDSVAFGESQSEDEQPENDLETDPPNWQQLVSREVLLGLKPCEI 

KRQEVINELFYTERAHVRTLKVLDQVFYQRVSREGILSPSELRK 

IFSNLEDILQLHIGLNEQMKAVRKRNETSVIDQIGEDLLTWFSG 

PGEEKLKHAAATFCSKQPFAIiEMIKSRQKKDSRFQTFVQDAESN 

P LCRRLQLKD I 1 PTQMQR LTKYPLLLDNIAT YTE WPT ERE KVKK 

AADHCRQI LNY VNOAVKEAENKQRLE D YQRR LDTS S LKLS EYPN 

v ajkn lajl> 1 KKKMl HEGPLWKVNRDKTIDLYTLLLED I LVLL 

QKQDDRLVLRCHSKI LASTADSKHT FS P VI KL STVLVRQVATDN 

KALFVISMSDNGAQIYELVAQTVSEKTVWQDLICRMAASVKEQS 

TKPIPLPQSTPGEGDNDEEDPSKLKEEQHGISVTGLQSPDRDOG 

LESTLISSKPQSHSLSTSGKSEVRDLFVAERQFAKEQHTDGTLK 

^ * w amj. if i^onUFVSEEKWALDALRNLGLLKQLLVQQLGLT 

EKSVQEDWQHFPRYRTASQGPQTDSVIQNSENIKAYHSGEGHMP 

FRTGTGDIATCYSPRTSTESFAPRDSVGLAPQDSQASNILVMDH 

MIMTPEMPTMEPEGGLDDSGEHFFDAREAHSDENPSEGDGAVNK 

EEKDVNLRISGNYLILDGYDPVQESSTDEEVASSLTLQPMTGIP 

AVESTHQQQHSPQNTHSDGAISPFTPEFLVQQRWGAMEYSCFEI 

QSPSSCADSQSQIMEYIHKIEADLEKLKKVEESYTILCQRLAGS 
ALTDKHSDKS 


6493 


557 


1147 


TPARMAYQGS S TS DCMSKTLDS AS AHFAAS A WSAP VPSRS EVA 
KEQNTGHNNINGWQPSGTS KTLYS TNMALS SS PG I S AVQL VRT 
VGHTTTNHLI PALCTSSPQTLPMNNSCLTNAVHLNNVS WS PVN 
vriXi ^ 1 Kl "wbi? i^^KJjATVAASMDRVPKVTPSSAI SS IARENH 
EPERLGLNGIAETTVAMEVT 


6494


2425 


1052 


AVAGGARPCSTPSSPHRRCRRHRPRPLPRPPAAIMSASAVYVLD 
L KG KVIj I CRN YRGDVDMS E VEH FMP I LME KE EEGMLS P I LAHGG 
VRFM W I KHNNLYLVATS KKNAC VS LVFS FLY KWQ VFS E Y FKEL 
SEESIRDNFVIIYELLDELMDFGYPQTTDSKILQEYITQEGHKL 
ETGAPRPPATVTNAVSWRSEGI KYRKNEVFLDVI BSVNLLVSAN 
GNVLRS E I VGSI KMRVFLSGMPE LRLGLNDKVLFDNTGRGKS KS 
VE LED VKFHQC VRLSRFENDRT I S FI PPDGEFELMSYRLNTHVK 
PLI W I E S VI EKHSHS RI E YM I KAKSQ FKRRSTANNVE I H I P VPN 
DADSPKFKTTVGSVKWVPENSE I VWS I KSFPGGKE YLMRAHFGL 
PS VEAEDKEGKPPISVKFEXPYFrTSGTrtVJ? vr.jfT 7 j?v<:pvnBT 
PWVRYITQNGDYQLRTQ 


6495
6496


2425 


1052 


AVAGGARPCSTPSSPHRRCRRHRPRPLPRppAAtxSASAVYVLD 
LKGKVL ICRNYRGDVDMSEVEHFMPlIiMEKEEEGMT.Q ott t\vnn 

VRFMWIXHNNLYLVATSKKNACVSLVPSFLYKVVQVFSEYFKEL 
EEESIRDNPVIIYELLDELMDFGYPQTTDSKIIjQEYITQEGHKL 
ETGAPR PPATVTNAVSWRSEG I ICYRKNE VFLDVlESVNIiLVSAN 
GNVLRS E I VGS IKMRVFLSGMPELRLGItNDKVLFDNTGRGKSKS 
VELEDVKFHQ C VRLS R FENDRTI S F I PPDGE FELMS YRLNTHVK 
PLI WIES VI EKHSHSR IE YMI KAKS QFKRRS TANNVE IHI P VPN 
DADSPKFKTTVGSVKWVPENSEIWSIKSFPGGKEYLMRAHFGL 
PS VEAEDKEGKPP I S VKFE I P YFTTSGIQVRYLKI IEKSGYQAL 
PWVRYITQNGDYQLRTQ 


6497 


247 


559 


LRAVSLLPLQLVLPEYSIHSLFCIMFLCAQEWLTLGLNVPLLFY^ 

HFWRYFHCPADSSEIiAYDPPVVMNADTLSYCQKEAWCKLAFYLL 
SFFYYLYCMIYTLVSS 




1053 


352 



ANTgiCRLCPRRHLHPPCGAKMGNGTEEDYNFVFKWLIGBSGV^ 
GKTNLLSRFTRNEFSHDSRTTIGVEFSTRTVMLGTAAVKAQIWD 
rAGLER YRAI TS AYYRG AVGALLVFDLTKHQ T YAWER WLKE LY 
DHAE AT I VVMLVGNKSDLSQAREVPTE EARMFAENNGLLFLE TS 
\LD S TNVELAFE TVL KE I FAKVS KQRQNSIRTNAI TLGSAQAGQ 








BPGPGEKRACCISL 


6498 


2636 


272 


SLRLCPWGTHLAGPTTMRLSSLIALLRPALPLILGLSLGCSLSL 
LRVSWIGGEGEDPCVEAVGERGGPQNPDSRARLDQSDEDFKPRI 
VPYYRDPNKPYKKVLRTRYIQTELGSRERLLYAVLTSRATLSTL 
AVAVNRT VAHHF PRLL YFTGQRGARAPAGMQVVS HGDERPAWLM 
S ETLRHLHTHFGADYDWFFIMQDDTYVQAPRIiAALAGHLS INQD 
LYLGRAEEFIGAGEQARYCHGGFGYLLSRSLIjIjRLRPHLDGCRG 
DI LSARPDEWLGRCH DSIiGVGCVSQHQGQQYRS FELAKNRDPE 
KEGSSAFLSAFAVHPVSEGTU4YRLHKRFSALELERAYSEIEQL 
QAQ I RNLTVLTPEGEAGLS WP VGL PAPFTFHSRFEVLGWD YFTE 
QHTFSCADGAPKCPLQGASRADVGDALETALEQLNRRYQPRLRF 
QKQRLLNGYRRFDPARGMEYTLDLLLECVTQRGHRRALARRVSL 
LRPLS RVE 1 LPMPYVTEATRVQLVLPLLVAEAAAAPAFLEAFAA 
NVLEPREHALLTLLLVYGPREGGRGAPDPFLGVKAAAAELERRY 
PGTRLAW1AVRAEAPSQVRLMDVVSKKHPVDTLFPLTTVWTRPG 
PE VLNRCRMNAISGWQAFFP VHFQEFNPALS PQRS PPGPPGAGP 
DP PS P PG AD PS RGAP IGGRFDRQ AS AEGCF YNAD YLAARARLAG 
ELAGQEEEEALEGLEVMDVFLRFSGLHLFRAVEPGLVQKFSJjRD 
CSPRLSEELYHRCRLSNLEGLGGRAQIiAMALFEQEQAKST 


6499 


3 


2040 


SCSADTRPSGQAWPTVGLRAAAGAFRTGSPLALGPETPQVACIjP 
GHPPVRPQVSGGPGAMPDPAAHLPFFYGSISRAEAEEHLKLAGM 
ADGLFLLRQCLRSLGG YVLSIiVHDVRFHHFP I ERQLNGTYAIAG 
GKAHCG P AEL CE F YSRD PDGLPCNLRKPCNR PSGLEPO PGVFDC 
LRDAMVRDYVRQTWKLEGEALEQAI 1 S QAPQVEKL IATTAHERM 
PWYHSSLTREEAERKLYSGAQTDGKFbliRPRKEQGTYALSLIYG 
KTV YHYL ISQDKAGKYC I PEGT KFDTIjWQL VE YL KQCADGL I YC 
LKE ACPNSSASNASGAAAPTIiPAHPSTLTHPQRR I DTLNS DG YT 
PEPARITSPDKPRPMPMDTSVYESPYSDPEELKDKKLFLKRDNL 
LIAD IELGCGNFGSVRQGVYRMRKKQ I DVAI KVLKQGTE KADTE 
EMMREAQ I MHQL DNP Y I VRL I G VCQAE ALMLVMEMAGGG PLHKF 
LVGKRE E I PVSNVAE LLHQ VS MGMKYIjEEKNFVHRDLAARNVIjIj 
VNRHYAKI SDFG LS KALGADDS YYTARS AG BCt^P LKW YAPE C INF 
RKFSSRSDVWSYGVTMWEAIiSYGQKPYKKMKGPEVMAFIEQGKR 
MECP PE C P P Eli YAXiMSDCW I YX WEDRPD FLTVEQRMRACY YS LA 
S KVEGPPGSTQKAEAACA 


6500 


1773 


726


TGPTHASADAWGLVRSVTBWCANVRGNPCAAALSCPQAVLDAGK 
MLS E S S S FLKG VMLG S IFCAL I TMLGH I R I GHGNRMHHHEHHHL 
QAPNKE D I LK I S EDE RMELSKSFR VYC 1 1 LVKPKDVSLWAAVKE 
TWTKHCDKAEFF S S ENVKVFE S I NMDTNDMWLMMR KAYKYAFDK 
YRDQYNWFFLARPTTFAI I ENLKYFLLKKDPSQP FYLGHT I KSG 
DLEYVGMEGGIVLSV2SMKRLNSLLN I PEKCPEQGGMI WKISED 
KQLAVCLK YAG VFAENAEDADGKDVFNTKS VGLS I KEAMTYHPN 
QWEGCCSDMAVTFNGLTPNQMHVMMYGVYRLRAFGPYFQ 




6501 


1 


570 


LVGMSGGGTETPVGCEAAPGGGSKKRDSIjGTAGSAHLIIKDLGE 
IHSRLLDHRPVIQGETRYFVKEFEEKRGLREMRVLENLKNMIHE 
TNEHTLP KCRDTMRDS LS QVLQRLQAANDS VCRLQ QREQER KKI 
HSDHLVASEKQHMLQWDNFMKEQPNKRAEVDEEHRKAMERLKEQ 
YAEME KDLAKFSTF 




6502 


213 


1650


AGNK PDP WAGRNRTAVL PDVS VFHRED VG WWRSW LQQ S YQAVKE ' 
KSSEALE FMKRDLTEFTQ WQHDTACT I AATAS WKE KI*ATEGS 
SGATE KMK KG LSD FLG VI SDTFAPS PDKT IDCDV I TLMGTPSGT 
AEPYDGTKARLYSLQSDPATYCNEPDGPPELFDAWIiSQFCIiEEK 
KGEISELLVGSPSIRAIiYTKMVPAAVSHSEFWHRYFYKVHQLEQ. 
EQARRDALKQRAEQS I SEEPGWEEEEEELMGIS PI S PKEAKVPV 
AKISTFPEGEPGPQS PCEENLVTS VEPPAEVTPSESSES ISLVT 
QIANPATAPEARVLPKDLSQKLLEASLEEQGIAVDVGETGPSPP 








IHSKPLTPAGHTGGPEPRPPARVETLREEAPTDLRVFELNSDSG 
iW * F ^w™^GM3&s> lux &auwnis±tr JJLiDMTEEeVQWAIiSKVDASG 
EVSGPGGSEGS EPNGPGCESS PQPAQLSPQBGPCSCLR 


6503 


213 


1650 


agnkpdpwagrnrtavlpdvsVfhredvgwwrswlqqsyqavke 

KS S E ALE FM KRDLTE FTQWQHDTACT I AATAS WKE KLATEGS 
SGATEKMKKGLSDFLGVISDTFAPSPDKTIDCDVITLMGTPSGT 
AEPYDGTKARLYSLQSDPATYCNEPDGPPELFDAWLSQFCLEEK 
KGE I SELL VG S PS I RAL YT KMVPAA VSHSE F WHR YF YKVHQLE Q 
nVAKrci^/ujj^^KAJaQSISEEPGWEEEEEKLMGISPISPKEAKVPV 
AKI STFF EGEPGPQS PCEENLVTS VEPPABVTPS ESSES IS LVT 
Q I ANPATAPEARVLPKDLSQKLLEASLEEQGLAVDVGETG PS PP 
IHS KPLTPAGHTGG PE PRP PARVETLREEAPTDLRVFELNSDSG 
KSTPSNNGKKGSSTDISEDWEKDFDLDMTBEEVQMALSKVDASG 
E VS G ?GG S EGS E PNG PGCESS PQ P AQLS PQEG P CSCLR 


6504 


2131 


1294 


GKVC-jVAHW VCLS ILS PPPAGMKTPNAQEAEGQQTRAAAGRATG 
OAwn i KJvAvSQKKQRGRPSSQPCRNIVGCRI SHGWKEGDEP ITQ 
WKGTVLDQVP IN PSL YL VK YDG IDCVYGLELHRDERVLSLKI LS 
DRVASSHISDANLANTI IGKAVEHMFEGEHGSKDEWRGMVLAQA 
PIMKAWFYITYEKDPVLYMYQLLDDYKEGDLRIMPESSESPPTE 
RE PGG WDGL I GKHVE Y T KEDGS KR I G MVIHQVEAKPS VYF I KF 
DDDFHI YVYDLVKKS 


6505 


2131 


1294 


GKVCLVAHWVCLSILS P PPAGMKTPNAQEAEGQQTRAAAGRATG 
SANMT KKKVS QKXQRGR PS SQPCRNIVGCR I SHGWKEGDEP ITQ 
WKGTVLDQVP I N PS LYLVK YDG I DC V YGLELHRDER VLS LK I LS 
DRVASSHISDANLANTI I G KAVEHM FEGEHGS KDE WRGMVLAQA 
PIM KAWFYI T Y EKD P VL YM YQLLDD YKEGDLR I M P ES SBS P PTE 
kisk^Ij v v ucl IGKHVE YTKEDGS KR I GMVI HQVEAKPS VYF I KF 
DDDFH I YVYDLVKKS 


6506 


1 


1350 


KVSPPTSCCLTVAVADPGVSEGFRGFGAGCEMPGRGRCPDCGST 
ELVEDSHYSQSQLVCSDCGCVVTEGVLTrTFSDEGNLREVTYSR 
STGENEQVSRSQQRGLRRVRDLCRVLQLPPTFBDTAVAYYQQAY 
RHSGIRAARLQKKEVLVGCCVLITCRQHNWPLTMGAICTLLYAD 
LDVFSSTYMQIVKIO/SLDVPSLCIJ^LVKTYCSSFKLFQASPSV 
PAKYVEDKEKMLSRTMQLVELANETWLVTGRHPLPVI TAATFLA 
"W^^s^^^w«jjt>^t>JLi/\Kr i KliAN VDLP YPASS RLQELLA VLLRMA 
EQLAWLRVLRLDKRSWKHIGDLLQHRQSLVRSAFRDGTABVET 
R EKEPPG WG QG QGEGSVGNNS LGLPQGKRPAS PALLLPPCML KS 

PKRICPVPPVSTVTGDENISDSEIEQYLRTPQEVRDFQRAQAAR 
QAATSVPNPP 


6507 


1878 


929 


RSHASRLPELPSGCLVLQVQELVQMSGMEAT VTI P I WQNKPHGA 
ARS WRRIGTNLPLKPCARAS FETLPNISDLCLRD VPP VPTLAD 
I AWIAADEEETYARVRS DTR PLRHTWKPS PLI VMQRNAS VPNLR 
GSEERLLALKKPALPALSRTTELQDELSHLRSQIAKIVAADAAS 
ASLTFDFLSPGSSNVSSPLPCFGSSFHSTTSFVISDITBETEVE 
VPELPSVPLLCSASPECCKPEHKAACSSSEEDDCVSLSKASSFA 
DMMGILKDFHRMKQSQDLNRSLLKE EDPAVL I SEVLRRKFALKE 
EDISRKGN 


6508
6509 


862 


342


WEARKRPQRWPSERREVRVPPPHLQRGRSGLEPGTFRKMAAARP 
SLGRVLPGSSVLFLCDMQEKFRHNIAYFPQIVSVAARMLKNTTL 
DLLDRGLQ VHVWDACS S RSQVDRLVALARMRQSGAFLS TSEGL 
I LQ L VGDAVHPQFKE I QKL IKE PAPDSGLLGLFQGQNS LLH 




2 


1053 


fv^vnprggrkrrrqaavtqaatrasgtpsprdgtmtOgklsvan 

KAPGTEGQQQVHGEKKEAPAVPSAPPSYEEATSGEGMKAGAFPP 

\ptavplhpswayvdpsssssydngfptgdhelfttpswddqkv 

RRVFVR KVYT I L L IQLLVTLAWAL FTFCDP VKDYVQANPG WYW 

«vsyavffatyltlaccsgprrhfpwnlilltvftlsmayltgml 


6510






SS Y YNTTS V biiCDG I TAL VCLS VT VFS FQTKFDFTS CQGVL F VL 
LMTIiFFSGLIIAILLPFQYVPWLHAVYAALGAGVFTLFLALDTQ 
LLMGNRRHSLSPEE Y I FGALNI YLD1 1 Y I FTFFLQLFGTNRE 




37 


1156 


P CALDGCPy KGAVH PIjLSSAMGLLAFIjKTQFVLHLIjVG FVF VVS 
GLVINFVQLCTLALWPVSKQLYRRLNCRLAYSLWSQLVMLLEWW 
SCTECTLFTDQATVERFGKBHAVIIIiNHNFEIDFI,CGWTMCERF 
GVLGSSKVIAKKELLYVPLIGWTW YFLE XVFCKRKWEEDRDTW 
EGLRRIjSDYPEYMWFLLYCEGTRFTETKHRVSMEVAAAKGLPVL 
K^^^PRTKGFrTAVKCLRGTVAAVYDVTT.MS , pr!TS3ifMT>OT tpty 1 

YGKKYEADMCVRRFPLEDIPLDEKEAAQWLHKLYQEKDALQEIY 
NQKGMFPGEQFKPARRPWTLLNFLSWATILLSPLFSFVLGVFAS 
GS PLLILTFLG FVGAGNGHCR 


6511
6512 


2541 


1425 


GEEQPLAAAPTECLEQVIGGAGDPGTWASFPSPLPGPAPI,KGGK 
TMATNFSDI VKQGYVKMKSRKLGT YRRPWr.vvw trc cc trponnr » 

KYPDEKSVCLRGCPKVTEISNVKCVTRLPKETKRQAVAIIFTDD 
SARTFTCDSELEAEEWYKTLSVECLGSRLNDISLGEPDLLAPGV 
QCEQTDRFNVFLLPCPWI*DVYGE cklqi theni ylwdihnprvk 
LVS WPLCSLRRYGRDATRFTFEAGRMCDAGEGLYTFQTQEGEQ I 
YQRVHSATLA1AEQEKRVLLEMEKNVRLLNKGTEHYSYPCTPTT 
MLPRSAYWHH I TGSQN I AE ASS YAGEGYGAAQAS SETDLLNRF I 
LLKPKPSQGDSSEAKTPSQ I 


6513


159 


807 


FGKKSTWF^i^RSLRVASGRSCKLGHGGYTGSGPGFGBPRDSGA 
EVPSGSGRATGCERGGVRGARQGRAPGSS IWRKEPRMVCTRKTK 
TLVSTCVILSGMTNI I CLLYVGWVTNYIASVYVRGQEPAPDKKL 
EE D KGDTLK 1 1 ERLDHLENVI KOW TORZi t> a vn vt? n t? n r» r> « 

LFAHWGQELSPEGRRVALKQFQYYGYNAYLSDRLPLDRP 


6514


2 


756 


FVS PE PGFS JjAOLNL I WQLTDTKQliVHSFABGQDQGSAyANRTA 
LFPDLIiAQGNASIiRLQRVRVADEGSFTCFVSIRDFGSAAVSLQV 
AAP YS K PSMTLEPNKDIiR PGDTVT I TCSS YQG YP EAEV FWQDGQ 
GVPLTGNVTTSQMANEQGLFDVHS 1 1»RWLGANGT YSCLVRNP V 
LQQDAHSSVTlTPQRSPTGAVEVOVPEDPwaT vrTn&TT dpci? 

SPEPGFSLAQLNLIWQLTDTKQLVHSFAEGQDQGSAYANRTALF 
PD LLAQGNAS LRLQRVRVADEGS FTCF VS IRD FGS AAVS LQ VAA 
PYSKPSMTLEPNKDLRPGDTVTITCSSYQGYPEAEVFWQDGQGV 
PLTGNVTTSQMANEQGLFDVHS I JjRWLGANGT YS CLVRNP VLQ 

QDAHSS VTITPQRSPTGAVEVQVPEDP WALVGTDATLRCSFS P 
E PG FSIiAQLNL I WQLTDTRQLVHS FTEGR 


6515 


985 


302 


YGI PGP T I SSAAEM KDLI*DIOEELR YSLATSRAKMGRRAQQESA I 

QAEWHLNGKNSS LTLTGETSSAKIiPRCRQGGWAGDS VKAS KFRR 

KASEEI EDFRLRPQSLNGSDYGGDIPI I PDLEEVQEEDFVLQVA 

APPS I Q I KRVMT YRDLDNDLMKYSAIQTLDGE I DLKLI/TKVLAP 

EHEVRERNPSWQDDVGWDWDHLFTEVSSEVLTEWDPXjQTEKEDP 
AGQARHT 


6516 


1345 


305 


v>RVG5RRRGAAVPGGCGAGSTQI^VSASASCX3AIX3SAl)MNPlVV 
yn^j^A^VL SKDRKERvHQGMVRAATVG YGILREGGSAVDAVEG I 
AWALEDDPEFNAGCGSVLNTNGEVEMDASIMDGKDLSAGAVSA 
VQClANPIKLARIiVMEKTPHCFLTDQGAAQFAAAMGVPEIPGEK 
LVTERNKKRLEKEKH2KGAQKTDCQKNLGTVGAVALDCKGNVAY 
AT3TGGI VNKMVGRVGDS PCLGAGGYADNDI GAVSTTGHGES IL 
KVNLAR LTIj FHIEQG2CTVBEAADLSLG YMKSRVKGLGGL I WS K 
rGDWVAKWTS TSMP WAAAKDGKLHFG I DPDDTT ITDL P 1 




1 


1402
3 
1 

] 


rKK^YLGgDATAAARDLRTRGLQGYCPSATARQQVLVSALQQL \ 
<GRRSBHRNENQEMPYSTNKELI LGIMVGTAGISLLLLWYHKVR 
CPGI AMKLPEFLS LGNTFNS I TLQDEI UDDQGTTV I FQE RQhQ I 
jEKIiNELLTN>!EELKEEIRFLKEAIPKLEEYIQDELGGKITVHK 
[SPQHRARKRRliPTIQSSATSNSSEEAESEGGYITANTDTEEQS 


"6517 j 3 




FPVPKAFNTRVEELNLDVLLQKVDHLRMSESGKSBSFEliLRDHK- 
E XFRDE I E FMWRFARAYGDM YE LSTNTQEKKH YAN I GKTLS E RA 
INRAPWGHCHLWYAVLCGYVSEFEGLQNKINYGHLFKEHLDIA 
I KLLiPEE PFLYYLKGRYCYTVS KLS WIBKKMAATLFGKIFSSTV 

QEAI^NFLKAEELCPGYSNPNYMYLAKCYTDLEENQNALKFCNL 
AIiLLPTVTKEDKEAQ KEMQ K IMTSL KR 




1414 


GRVWGGSSSLNAMVYVRGHAEDYERWQRQGARGWDYAHCLPYFR " 
KAQGHE LGAS RYRG ADGP LRVS RGKTNHPLH CAFLEATQQAG YP 
LTEDMNGFQQEG FG WMDMT 1 HEGKR WSAACAYLHPALS RTNI»KA 
EAErLVSRVLFEGTRAVGVEYVKNGQSHRAYASKEVlLSGGAIN 
S PQ£»LMLS G IGNADDLKKLG I PWCHL PGVGQNLQDHIiE I Yl QQ 
ACTRPITLHSAQKPLRKVCIGLEWLWKFTGEGATAHLETGGFTR 
S Q PGVPH PD IQFHFL PS QVI DHGRVPTQQEAYQVHVG PMRGTS V 
GWLKLRSANPQDHPVIQPNYLSTETDIEDFRLCVKLTREIFAQE 
ALAPFRGKELQPGSHIQSDKEIDAFVRAKADSAYHPSCTCKMGQ 
PSDPTAWDPQTRVLGVENLRWDASIMPSMVSGNLNAPTIMIA 
E KAADI I KGQPALWDKD VPV YKPRTLATQR 


6518 242

6519 3 


1098 


PAWNPGSEPKTRVRPRARSFPLPPPRAPRRRRHRLLRAVPGPSR 
RHRCRRRAPP PPSTMGDAGS ERSKAPSLPPRCPCGFWGSSKTKN 
LCS KC FADFQKKQPDDDS AP STSNSQSDLFS EETTS DNNNTS IT 
TPTLSPSQQPLPTELNVTSPSKEECGPCTDTAHVSLTTPT^R^i- 
GTDS QS ENEAS P VKRPRLLiENTERS EETSRS KQKSRRRC FQ CQT 

KLELVQQELGSCRCGYVFCMLHRLPEQHDCTFDHMGRGREEAIM 
KM VKLDRK VG RS CQR I GEGCS 


6520 3


1113 


BKJ&MftgPP S PV H.CJ VAAAAPTAT VS E KE P FG KLQ LSSRD P PQSLS 
AKKVRTEEKKAPRRVNGEGGSGGNSRQLQPPAAPSPQS YGSPAS 
WSFAPLSAAPSPSSSRSSFSFSAGTAVPSSASASLSQPGPRKLI* 
VPPTLLHAQPHHLLLPAAAAAASANAKSRRPKEXREKERRRHGL 
GGARE AGGAS RE ENGE VKP LPRDKI KDK I KERDKE KERE KK KHK 
VMNE Z KKENGEVKI LLKSGKEKPKTNI EDLQI KKVKKKKKKKHK 
ENEKRKRPKM YSKS IQTI CSGIiLTDVEDQAAKG ILNDNI KD YVG 
KNLDTKNYDS KI PENS E FP FVSLKEPRVQUNIjKRLDTLEFKOLI 
HIEKQPNGGASVrHCLQ 




1113 


bK^MAEPPSPVHCVAAAAPTA'IVSEKEPFGKl.^SSRDPPGSLS " 

AKKVRTEEKKAPRRVNGEGGSGGNSRQLQPPAAPSPQSYGSPAS 

WSFAPLSAAPSPSSSRSSPSFSAGTAVPSSASASLSQPGPRKLIi 

VPPTIiLHAQPHHLLIjPAAAAAASANAKSRRPKE KRE KERRRHGL 

GGAREAGGASREENGEVKPLPRDKIKDKIKERDKE3CEREKKKHK 

VMNEIKKENGEVKILLKSGKEKPKTNIEDLQIKKVKKKKKKKHK 

ENE KRKRP XMY S KS I QTI CS GLLTDVEDQAAKG I LNDNIKD Y VG 

KNLDTKNYDSKIPENSEFPFVSLKEPRVQNNLKRLDTLEFKQLI 
HIEHQPNGGASVIHCLQ ^ 


6522 


104

1042 


1798 


391


iUirAT^xiJrSQGELVHPKAiPlilVGAQLlHADKIiGEKVSDSTMP - 

1 KETPPKSKLAEGEEEKPEPDISSEESVSTVEEQENE 
TP PATS S EAEQPKGEPENEEKEENKS S EETKKDEKDn q kt? vt? vtt 

VKXTIPSWATLSASQIiARAQKQTPMASSPRPKMDAILTEATKAC 
FQXSGAS WAIRKYI IHKYPSLELERRGYLLKQALKRELNRGVI 
KQVKGKGASGSFWVQKSRKTPQKSRNRKNRSSAVDPEPQVKLE 
DVLPLAFTRLCEPKEASYSDIRKYVSQYYPKLRVDIRPQLLKNA 
bQRAVERGQLEQI TGKGASGTFQLKKSGE KPLLGGSLME YAI LS 

^IAAMNEPKTCSTTALKKYVLEWHPGTNSNYQMHLLKKTLQKCB 
<NGWMEQISGKGFSGTFQLCFPYYPSPGVLFPKKEPDDSRDEDE 
)EDES S EEDS ED EE P P PKRRLQKKTPAKS PGKAAS VKQRGS KPA 

>KVSAAQRGKARPLPKKAPPKAKTPAKKTRPSSTVIKKPSGGSS 
.fCPATSARKE 






fKWLRPSPRSHRTPESGRVLSLFRLPPPGMALSGSTPAPCWEED 








ECLDYYGMLSLHRMFEWGGQLTECELELLAFLLDEAPGAAGGL j 
SRARSGLKLLLELERRGQCDESNLRLLGQLLRVLARHDLLPHLA 
RKRRRPVS PERYS YGTSSSSKRTEGSCRRRRQSSSSANSQQGSP 
PTKRQRRSRGRPSGGARRRRRGPQPHPSSSQSPPDLPLKAK | 


6523
6524


2 


1097 


ASCQTRRRTAALDSGERIAGRRSPIALAMASNFNDI VKQGYVKI H 
RSRKLGlFRRCWLVFKKASSKGPRRIiEKFPDEKAAYFRNFHKVT 
ELHNIKNITRLPRETKKHAVAIIFHDETSKTFACESELEAEEWC 
KH LCM ECLGTR LND I S LGE PDLLAAG VQREQNER FNVYLMPT PN 
LD I YGECTMQ I THEN I YLWD IHNAK VKLVMWP LS S LRR YGRDS T 
VJFTFESGRMCDTGEGLFTFQTREGEMIYQKVHSATIiAlAEQHBR 
LMLEME QKARLQTSLTE PMTLS KS I S L PRS AYWHH I TRQNS VGE 

IYSLQGNHENRHSDLTGKSCKTSENRFLEENAPLVMYGITHHLF 
MDTSTCKWHDLE




2 


1097 


AS CQTRRR TAALDS GER I AG RRS P 1 ALAMASNFND I VKQG YVKI ( 
RSRKLGIFRRCWLVFKKASSKGPRRLEKFPDEKAAYFRNFHKVT 
ELHNIKNITRLPRETKKHAVAIIFHDETSKTFACESELEAEEWC 
KHLCMECLGTRLNDISLGEPDLLAAGVQREQNERFNVYLMPTPN 
LD I YGECTMQ I TH EN I YLWD I HNAKVXLVMW P LS SLRR YGRDST 

WFTFESGRMCDTGEGLFTFQTREGEMI YQKVHSATLAIAEQHER , 
LMLEMEQKARLQTSLTE PMTLSKS I S LPRSAYWHH ITRQNS VGE 

IYSLQGNHENRHSDLTGKSCKTSENRFLEENAPLVMYGITHHLF 
MDTSTCKWHDLE


6525


1 


1859 


GESPFSEBES I EFNPSS SGRSARTVSSNS FCSDDTGWPS SQS VS "I 

PVKTPSDAGNSPIGFCPGSDEGFTRKKCTIGMVGEGSIQSSRYK 

KESKSGLVKPGSEADFSSSSSTGSISAPEVHMSTAGSKRSSSSR 

NRGPHGRSNGASSHKPGSSPSSPREKDLLSMLCRNQLSPVNIHP 

SYAPSSPSSSNSGSYKGSDCSPIMRRSGRYMSCGENHGVRPPNP 

EQYLTPIiOXJKEVTVrhLKTKLKESERRLHERESEIVELKSQLAR 

MREDWIEEECHRVEAQLALKEARKEIKQLKQVIETMRSSLADKD 

KGIQKYFVDINIQNKKLESLLQSMEMAHSGSLRDELCLDFPCDS 

PEKSLTLNPPLDTMADGLSLEEQVTGEGADRELLVGDSIANSTD 

LFDEIVTATTTESGDLELAmSTPGANVLELLPIVMGQEEGSWV 

ERAVQTDWPYSPAISELIQSVLQKLQDPCPSSLASPDESEPDS 

MES FPESLSAL WDLTPRNPNSAILLS PVETPYANVDAEVHANR 

LMRE LD FAACVEERLDG VI PLARGGWRQY WS S S FLVDLLAVAA 

P WPT VLWAFS TQRGGTD P VYNIGALLRGCCWALHSLRR TAFR 

IKT


6526 



2 


2034 


3 


SGRAGEPEEWRGRQIIDSKETWIPFNSEDSQQLEEAYSSGKGCN'l 
GR WP TDGGRYDVHLGERMR YAVY WDELAS E VRRCTWF YKGDKD 
NK YVP Y S ES FS Q VLEET YMLAVTLDE WKKKLES PNRE 1 1 1 LHNP 
KLMVHYQPVAGSDDWGSTPMEQGRPRTVKRGVENISVDIHCGEP 
LQIDHLVPWHG I GPACDLR FRS I VQCVNDFRS VSLNLLQTHFK 
KAQENQQIGRVE FLPVNWHS PLHS TGVDVDLQRITLPS INRLRH 
FTNDT I LDVFF YNS PT YCQTI VDTVAS EMNR I YTLFLQRNPDFK 
GGVSIAGHSLGSLILFDILTNQKDSLGDIDSEKGSLNIVMDQGD 
TPTLEEDLKKLQLS E FFD I FE KEKVDKE ALALCTDRDLQE I G I P 
LGPRKKILNYFSTRKNSMGIKRPAPQPASGANIPKESEFCSSSN 
TRNGD YLD VG I GQVS VKYPRL I YKPE I FFAFGS P IGM FLTVRGL 

KRIDPNYRFPTCKGFFNIYHPFDPVAYRIEPMWPGVEFEPMLI 
PHHKG RKRMHLELR EGLTRMSMDL KNNLLG SLRMAWKS FTRAP Y 
PALQASETPEETEAEPESTSEKPSDVNTEETSVAVKEEVLPINV 

3MLNGGQRIDYVLQEKPIESFNEYLFALQSHLCYWESEDTVLLV 
LjKEIYQTQGIFLDQPLQ 1 


6527


1 


922


?WVPLLSRILPSDACKIYKQGINIRLDTTLIDFTDMKCQRGDLS 1 
? IFNGDAAPSES FWLDNEQKVYQRIHHEESEMETEEE VDILMS 1 
JDIYSATLSTKS ISFTRAQTGWLFREDKTERVGNFLADFYLVNG 








SLTPPPQNTITW3EYISAENGKAPHLGRELVCKESKKTFKATIA 
MS Q E FP LG I ELLLNVLB WAPFKHPNKIj REFVQM KLP FGFP VKL 
DIPVFPTITATVTFQEFRYDSFDGSIFTIPDDYKEDPSRFPDIi 


6528 


1 


1073 


LTG PAAAE PRCAADAGM KRAX»GRRKGVWLRLR KI L FCVLGL Y I A 
IPFLI KLCPG I QAKIil FLNFVRVP YFI DLKKPQDQGLNHTCN YY 
LQPEEDVTIGVWHTVPAVWWKNAQGKDQMWYEDAliASSHPIILY 
LHGNAG TRGGDHRVELY K VLS S LG YHWTFDYRG WGDSVGTP S E 
RGMTYDALHVFDWIKARSGDNPVYIWGHSLGTGVATNIjVRRLCE 
RETPPDALILESPFTNIREEAKSHPFSVIYRYFPGFDWFFLDPI 
rSSG I KFANDENVKH I S C PLL I LHAEDDP WPFQLGRKL YS I AA 
PARS FRDFKVQFVP FHS DLG Y RHK Y I YKS PELP R I LREFLG KS E 
PEHQH 


6529
6530


363 


2215 


rHIRYNKIGWKTMSCGNEFVETLKKIGYPKADNLNGEDFDWLF 
EGVB D ES FLKWFCGNVNEQNVLS ER ELEAFSILQKSGKP I LEGA 
alde alktckts dlktprldd kele kledevqtllklknlk i qr 
RNKCQLMASVTSHKSLRLNAKEEKATKKIjKQSQGIIiNAMITKIS 
NEI»Q ALTDE VTQLMMFFRHSNLGQGTNP LVFLSQ F SLEKYLSQE 
EQSTAALTLYTKKQFFQGIHEWESSNESQFFNFLKIQTPSICD 

nqeileerrlemarlqlayicaqhqlihlkasnssmkssikwae 
eslhsi,tskavdkenldakissltseimklekevtqikdrsi>pa 
wrenaqllkmpwkgdfdlqiakqdyytarqelvlnqlikqka 
sfellqls ye ielrkhrd i yrqlenlvqelsqsnmml ykqleml 

TDPSVSQQINPRNTIDTKDYSTHRLYQVLEGENKKKELFLTHGN 
LEE VAEKLiKQNI S LVQDQLAVSAQEHS FFLSKRNKDVDMLCDTL 
YQGGNQLLLSDQELTEQ FH KVE SQLNKLNHLLTD I LAD VKTKRK 

TLANNKLHQMEREFYVYFLKDEDYLKDIVENLETQSKIKAVSLE 
D 




128 


2986 


G AAHHG AI VQ VHP LLPGS S TI M I HDLCL VF P APAKA WYVS D I Q 

elyirvvdkveigktvkayvrvldlhkkpflakyfpfmdlklra 
as put l valdealdnyti tfl i rgvaig qtsltas vtnkagqr 
insapqqievfppfrlmprkvtlligatmqvtseggpqpqsnil 

FS I SNES VALVS AAGLVQGLA I GNGT VS GLVQAVD AETGKWI I 
S QDLVQ VEVLLLRAVR IRAP Z MRMRTGTQMP I YVTG I TNHQNPF 
S FGNAVPGLTFHWS VTKRDVLDLRGRHHEAS IRLPSQ YNFAMW 
LGRVKGRTGLRAWKAVDPTSGQLYGLARELSDEIQVQVFEKLQ 
LLNPE I EAEQILMS PNS Y I KLQTNRDGAAS LS YRVLDG PE KVP V 
VHVDEKGFLASGSMIGTSTIEVT AOPPPfsaKrr>TT Ti/&wtnrr« oxrcs 
YLRVSMS PVLHTQNKEALVAVPLGMTVTFTVHFHDNSGDVFIIAH 
SSVLNFATNRDDFVQIGKGPTNNTCVVRTVSVGLTLLRVWDAKH 
PGLSDFMPLPVLQAI S PELSGAMWGDVLCLATVLTS LEGLSGT 
WSSSANS I LHIDPKTGVAVARAVGSVTVYYEVAGHLRTYKE WV 
SVPQRIMARHLHPIQTSFQEATASKVIVAVGDRSSNLRGECTPT 
QR E VI QALHPE T L I S CQSQFKPAVFD FP SQDVFT VEPQ FDTALG 
QY PCS I TMHRLTDKQR KHLS MKKTALWSAS LSS S HFS TEQ VGA 
EVPFSPGLFADQAEILLSNHYTSSEIRVFGAPEVLENLEVKSG3 
PAVLAFAKE KS FG WPS F I TYTVGVLDPAAG S QGPLSTTLTFS S P 
VTNQAI AI PVTVAFVVDRRGPGP YGASLFQHFLDS YQVMFFTLF 
ALLAGTAVMI IA YHTVCrPRDLAVPAALTPRASPGHS PHYFAAS 
SPTSPNALPPARKASPPSGLWSPAYASH 


6531
6532 


2 


1425 

954


PSASIPPSASPDPVPDIRTCHFCLVEDPSVGCISGSEKCTISSS 
SLCMVITIYYDVKVRFIVRGCGQYISYRCQEKRNTYFABYWYQA 
QCCQYDYCNSWSSPQLQSSLPEPHDRPLALPLSDSQIQWFYQAL 
DTLSLPLPNFHAGTEPDGLDPMVTLS LNLGLS FAELRRMYLFLNS 
SGLLVLPQAGLLTPHPS 

^GPPSEVWQDSLFPEPEPGPAPQVLtGPQGPGLI^GVAPPTLj 








I TDS TGTHIj VI/T VTN KNAHS PGLS RG S PQQP SSQ PGS P APAPS A 
QMDLEHPLQPLFGTPTSIiliKKBPPGYEEAMSQQPKQQENGS SSQ 
QMDDLFDILIQSGEISAJDFKEPPSLPGKEKPSPKTVCWSPLAAQ 
PSPSAELPQAAPPPPGSPSLPGRLBDFLESSTGLPLLTSGHDGP 
E PLS L I DDLHS QMLSS TAI LDH P P S PMDTSELHFVP EPS STMGL 

DLADGHLDSMDWLELSSGGPVLSLAPliSTTAPSLFSTDFLDGHD 
1 LQLHWDSCL 


6533 


1798 


373 


STISWIiARVEPPRRSSGVGAARLRFPGGSRPLRARACVIiAllAVL 
ALLERNNADSMS AHSMIiCER I AIAKE Ij I KRAES L S RS RKGG I EG 
GAKLCS KLKAELKFLQ KVEAGKVAI XESHLQSTNLTHLRAI VES 
AENLEEWSVLHVFGYTDTLGEKQTLWDWANGGHTWVKAIGR 
KAEALHNIWLGRGQyGDKSXIEQAEDFLQASHQQPVOYSNPHII 
FAFYNS VS S P MAE KLKEMG I S VRG D I VAVNALLDHPEE LQ PS ES 
ESDDEGPELLQVTRVDRENI LAS VAFPTBIKVDVCKR VNLDI TT 
LITYVSALSYGGCHFIFKEKVLTEQAEQERKEQVLPQLEAFMKD 
KELFACESAVKDFQSILDTLGGPGERERATVLI KR INWPDQPS 
ERALRL VASS KI NS R S LT I FGTGDTLKAI TMTANS G F VRAANNQ 
GVKFSVFIHQPRALTESKEAIATPLPKDYTTDSEH 


6534 


47 


596 


KATRF ISAAFWIiNKQGVS PAKLPHTS WSWSLQTLS FLFSGDLA 
EKSLQCFPCSAMLLELI PLLG IHFVLRTARAQS VTQPDIH IT VS 
EGASLELRCN YS YGATP YLFWMERTVE E AF I LLVCLK P WRVAS S 

LEKKEKEDESFQIiLLGSRYNVLKAHCLLPLIRWLTSGDSLLSAO 
PHCPQGL 


6535


250 


964 


liktffrdvaiqrdllpkbknletllti^fleidkafssHarLs^ 

ADATLLTSGTTATVALLRDGIELWASVGDSRAILCRKGKPMKL 
TIDHTPERKDEKERI KKCGGFVAWNSLGQPHVNGRLAMTRS IGD 
LDLKTSGVT AEPETKR J KLHHADDS F.L VLTTDG INFM VNSQE I W 
D FVNQCH D PNEAAHAVTEQAI QYGTEDNSTAVWPFGAWG KYKN 
SEINFS FS RSFASSGRWA 


6536 


242 


1174 


SLVKEMTNQYGILFKQEQAHDDAIWSVAWGTNKKENSETVVTGS 
LDDLVKVWKWRDERIiDLQWSLEGHQLGVVSVDISHTLPIAASSS 
IiDAHIRLWDLENGKQ IKS I D AG P VD A WTLAFS PDS Q Y LATGTHV 

GKVNIFGVESGKKEYSLDTRGKFILSIAYSPDGKYLASGAIDGI 
INI FD IATGKLLHTLEGHAMP I RS LTFS PDSQLLVTASDDG YI K 
lYDVQHANIiAGTLSGHASWVLNVAFCPDDTHFVSSSSDKSVKVW 
D VGTRTC VHTFFDHQDQVWG VKYNGNGS K I VS VGDDQE IH I YDC 
PI 


6537 


1638 


921


NRFNPPPTQGPDPSLVYRPDVDPE VAKDKAS FRNYTSGPLLDRV 
FTTYKX^THQTVDFVRSKHAQFGGFSYKKMTVMEAVDLLDGLV 
DE S DPDVDFPNS FHAFQTAEG I RKAHPDKOW FHLVGLLHDLGKV 
LALFGEPQWAWGDTFPVGCRPQASWFCDSTFQDNPDLQDPRY 
STELGMYQPHCGLDRVLMSWGHDGEARGGQWGGGGRWGTVGGGG 
AE AVPAGDTLS PQSTCTR 


6538 
6539


3345 
218 


2412 


339


P YLYDFLDAIj ITCQTAPEEAF I KLDGLAGMIjTEQIjRRLTKQVQE 
ARHNRDDEAIKKAVNEYDETMEKYXPVIiMAOAKIYWNLENYPMV 
EKIFRKSVEFCNDHDVWKriNVAHVLFMQBNKYKEAIGFYEPIVK 
KHYDNILNVSAIVJLANLCVSYIMTSQNEKAEEIiMRKIEKEEEQL 
S YDDPNRKM YHLCI VNLVIGTLYCAKGNYBFGISRVI KS LEPYN 
KKI^TDTWYYAKRCFLSIiLEICMSKHMIVIHDSVIQECVQFLGHC 
BLYGTN I PAVI EQPLEEERMHVG1CNTVTDESRQLKAL I YE I IG W 

FLGAASPHPHFSSLAPHPDQPEFTPVQDELEAMELWGPGV 1 


6540 


3 


391
1 


jERLWLLLLRRPEX>AMAECPTLGEAVTDHPDRLWAWEKFVYLDE 
<QHAWLPLT I E I KDR LQIiRVLLRRED WLGRPMTPTQ I G P S LL P 
C MWQLYPDGRYRSS DS S FWRL VYH I K I DGVBDMKLEUt P DD 


6541 


1165 


536 


RTLVQRR I IjMIjIjRKPARGRDLRGRGRGTPRGGRKGLLPTPDE FP 
R FEGGRKPDS WDGNREPGPGHEHFRDTPR PDHPPHDGHSPAS RE 
RSSSLQGMDMASLPPRKRPWHDGPGTSEHREMEAPGGPSEDRGG 
KGRGGPGPAQRVPKSGRSSSLDGEHHDGYHRDEPFGGPPGSGTP 
S RGGRSGS NWGRGSNMNSG PP RRGASRGGGRGR 


6542


3 


3775 


S WPRGRGE TGGHPGALRTRTMQKS VRYNBGHALYLAFIARKEGT 
KRGFLS KKTAEAS R WHEKWFAL YQNVLF Y FEGEQS CRPAGM YIiI* 
EGCSCERTPAPPRAGAGQGGVRDALDKQYYFTVLFGHEGQKPLE 
IiRCEEEQDGKEWMEAI HQAS YADILIERBVLMQKY 1HLVQ I VET 
BKIAANQLRHQLEDQDTEI BRLKSEI IALNKTKERMRPYQSNQE 
DEDPDIKK I KKVQSFMRGWLCRRKWKTI VQDYICSPHAESMRKR 
NQ I VFTMVE AES EYVHQLY ILVNGFLRPLRMAASS KKPP I SHDD 
VSS IFLNSE TXMFLHE I FHQGLKAR I ANWPTLILADLFDI LLPM 
LNIYQEFVRNHQYSI^VLANCKQNRDFDKLLKQYEANPACEGRM 
ItETFLTYPMFQI PRYI tTLHELLAHTPHEHVERKSLEFAKSKUB 
ELSRVMHDEVSDTEN IRKNLAI ERM I VEGCD I LLDTSQTFIRQG 
SLIQVPSVERGKLSKVRLGSLSLKKEGERQCFLFTKHFLICTRS 
SGGKLHLLKTGGVLSLIDCTLIEEPDlASDDDSKGSGQVFGHXiDF 
KI WEPPDRAAFTWLLAPSRQEKAAWMSDI S QCVDN I RCNGLM 
TI VFEENSKVTVPHMI KSDARLHKDDTDICFS KTLNSCKVPQIR 
YAS VERLLERLTDLRFLS I DFLNTFLHTYRI FTTAAWLGKLSD 
I YKRPPTS IP VRSLELFFATSQNNRGEHLVDGKS PRLCRKFSSP 
PPIAVSRTSSPVRARKI^SLTSPI^NSKIGALDLTTSSSPTTTTQS 
PAAS PPPHTGQI PLDLSRGLSSPEQSPGTVEENVDNPRVDLCNK 
LKRS IQKAVLESAPADRAGVESSPAADTTELSPCRSPSTPRHLR 
YRQ PGGQTADNAHCSVSPASAFAIATAAAGHGS PPGFNNTERTC 
DKEFI IRRTATNRVLNVLRHWVSKHAQDFELNNELKMNVLNLLE 
EVLRDPDLLPQE RKAAAN I LMALS Q DDQDD I HLKLED 1 1 QMTDC 
MKAECFESLSAMELAEQITLLDHVIFRSIPYEEFI/3QGWMKLDK 
NERTPYIMKTSQHFNDMSNIiVASQIMNYADVSSRANAIEKWVAV 
ADI CRCLHN YNGVIiE I TS ALNRSAI YR L»KKT WAKVS KQTKALMD 
KLQKTVSSEGRFKNLRETLKNCNPPAVPYIjGMYLTDLAFIEEGT 
PNFTEEGLVNFSKMRMISHIIREIRQFQQTSYRIDHQPKVAQYL 
LDKDLI IDEDTLYELSLKIEPRLPA 


6543 


1857 


950 


FVSGCGRAG IGLS WAMAAEAR VSRWYFGGLAS CGAACCTHPLDL 
LKVHLQTQQEVKLRMTGMALRWRTDGI LALYSGLSASLCRQMT 
YSLTRFAIYETVRDRVAKGSQGPLPFHEKVLLGSVSGLAGGFVG 
TPADLVNVRMQITOVKLPQGQRRNYAHALDGLYRVAREEGLRRiF 
S GATMAS S RGALVTVGQLS C YDQAKQL VLSTG YLSDNI FTHFVA 
S FI AGGCATFLCQPLDVLKTRLMNSKGBYQGVFHCAVETAIOwGP 
LAFYKGLVPAGI RLIPHTVLTFVFLEQLRKNFG I KVPS 


6544 


630 


79 


PS P CF I RSRLDGQP WMAGLEAWLSQNFS LHQPQSRVRVRRAS I S 
EPSDTDPEPRTLNPSPAGWFVQQHPBIiELMSSFRERFGRNWLQY 
RSHLEPSGNPLPATPTTSAPSAPPASSQGPDTAPRPSPPQEEAR 
u r Wc. a tfy KM£> tu E VKAE PQEEEEEKEGKEEKEEGEMAPL PEAHLG 
EGKQKECP 


6545 


176 


560


P PHSHAAIiLPAAMTPLLTIil LVVLMGLPIAQAIiDCHVCAYNGDN 
C FN PMR C P AMVAYCMTTRT Y YTPTRMKVS KS CVPRC FETVYDG Y 
SKHASTTSCCQYDIjCNGTGLATPATLALAPILLATLWGLL 


6546 



1657 


364 


HLLNGLDEVAAFFVADLGAIVRKHFCFLKCLPRVRPFYAVKCNS " 
SPGVLKVIiAQLGLGFSCANKAEMEIjVQHIGIPASKI icanpcxq 
IAQIKYAAKHGIQLLSFDNEMELAKWKSHPSAKMVLCIATDDS 
HSLSCLSLKFGVSLKSCRHLLENAKKHHVEWGVSFHIGSGCPD 
PQAYAQS I ADARLVFEMGTELGHKMHVU3LGGGFPGTEGAKVR F 
EE I AS VIN S ALDLY FPEGCGVD I FAELGR Y YVTS AFTVAVS 1 1 A 
KKEVLLDQPGRJEEENGSTS KTIVYHLDEGV YGI FNSVLFDNICP 
SEQ
ID
NO:


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








TPILQKKPSTEQPLYSSSLWGPAVDGCDCVAEGLWLPQLHVGDW 
LVFDNMGAYTVGMGSPFWGTQACHXTYAMSRVAWEALRRQLMAA 
EQEDDVEGVCKPLSCGWEITDTLCVGPVFTPASIM 


6547 


1 


541 


^HSKYLAPALCSQPGMMRCCRRRCCCKQPPHALRPLLIiLPLVLL 

PPLAAAAAGPNRCDTIYQGFAECLIRIiGDSMGRGGELETrCRSW 

NDFHACASQVLSGCPEEAAAW7ESLQQEARQAPRPNNLHTLCGA 

PVHVRERGTGSBTNQETLRATAPALPMAPAPPLLAAALALAYLI, 
RPIA 


6548 


2 


219 


fvsri>svrdvrfptflgghgadamhtdpdVsaayvpietdaedg 

IKGCGITFTLGKGTEVGELKIL5RFQNA 


6549 


73 


1490 


etgrvcedarpacgsrsrrrrkeaapgiptpspssssptssrpa" 

ARAFSKAPARLSRPRAREEPPDPGRRYIQBE I IQARKHKLI KMC 

ssvaaklwfltdrriredypqkeilralkakcceeeldfrawm 
dewltieqgnlglringelitaypqwwrvptpwvqsdsdit 
vlrhlekmgcrlmnrpqailncvnkfwtfqelaghgvplpdtfs 

YGGH3NFAKM IDEAEVJLEFPMWKNTRGHRGKAVFIiARDiGiHLA 
DLSHLIRHEAPYLFQKYVKESHGRDVRVIWGGRWGTMLRCST 
DGRMQSNCS LGG VGMM CS LSEQGKQLAXQVSNIhGMDVCG I DIiL 
MKDDGSFCVCEANANVGFIAFDKACNLDVAGIIADYAASLLPSG 
RLTRRMSLLSWSTASETSEPELGPPASTAVDNMSASSSSVDSD 
PBSTERELLTKIiPGGLFNMNQLLANE I KLLVD 


6550 


2293 


922 


FRVSRDGAPDCGIEQMGl^AMEHGGSYARAGGSSRGCWYYLRYFF^ 

LFVSLIQFLIILGLVLFMVYGNVHVSTESNLQATERRAEGLYSQ 

LLGLTASQSNLTKBI^FTTRAKDAIMQMWLNARRDLDRINASFR 

QCQGDRVIYTNNQRYMAAIILSBKQCRDQFKDMNKSCDALLFML 

NQKVKTLEVEIAKEKTICTKDKBSVLLm^VAEEQLVECVKTRE 

LQHQERQIAKEQLQKVQALCLPLDICDKFEMDLRNLWRDSIIPRS 

LDNLGYNLYHPLGSEIASIRRACDHMPSLMSSKVEEIARSLRAD 

IERVARENSDLQRQKLEAQQGLRASQEAKQKVE KEAQAREAKLQ 

AECSRQTQIALEEKAVLRKERDNLAJCELEEKKREAEQLRMELAI 

RNSALDTCIKTKSQPMMPVSRPMGPVPNPQPIDPASLEEFKRKI 

LESQRPPAGI PVAPSSG 


6551 
6552 


157 


748 


IQPPUi'RNMTIAAYKEKMKELPLVSLFCSCFLADPLNKSSYKYE 
ADTVDIjNWCVISDMEVIELNKCTSGQSFEVILKPPSFDGVPEFN 
ASLPRRRDPSLEEIQKKIaEAAEERRiCYOEAELLKHLAEKREHER 
EVIQKAIEENIWFIN'IAKEKLAQKMESNKENREAHLAAMLERLQ 
EKDKHAEEVRKNKELKEEASR 




157 


748


igPPDPRNMTIAAYKEKMKELPLVSLFCSCFLADPLNKSSYKYE 
ADTVDLNWCVISDMEVIBLNKCTSGQSFEVILKPPSFDGVPEFN 
ASLPRRRDPSLEEIQKKLEAAEERRKYQEAELLKHLAEKREHER 
EVIQKAIEENNNFI KMAKEKIiAQKMESNKENREAHLAAMLERLQ 
EKDKHAEEVRKNKELKEEASR 


6553


2 


1807 



FVWS KMAAHIjS YGR VNLNVIiR EAVRRELREFLD JKCAGS KAI VWD ~ 

EYLTGPFGLIAQYSLLKEHEVEKMFTLKGNRLPAADVKNIIFFV 
RPRIiE L MD I IAENVLS EDRRG P TRDFH T "LT?\7 PPR Q t t r»co»r wt> 

LGVLGSFIHREBYSLDLIPFDGDbLSMESEGAFKECYLEGDQTS 
L YHAA KGLMTLQAL YG TT PQ I FGKGE CARQ VANMM I RMKRE FTG 
SONS I FPVFDNLHiLDRNVDLLTPIATQLTYEGLIDEI YGIQNS 
YVKLPPEKFAPKKQGDGGKDLPTEAKKLQLNSAEBLYAEIRDKN 
?>IAVGSVLSKKAKI 1 SAAFEERHNAKTVGE r KQFVSQLPHMQAA 

RGSIjANHTSIAEIjIKDVTTSEDFFDKLTVEQEFMSGIDTDKVNN 
if I E DC I AQ KHSL I KVT*RLVCLQSVCNS GLKQKVLD Y YKRE I LQT 
KG YEH I LTLHNL E KAGLLKPQTGGRNNYP T IRKTLRLWMDD VNE 
2NPTD I S YVYSGYAPLS VRIiAQLLSRPGWRS IEEVIjR I LPGPHF 

serqplptglqkkrqpgenrvtli fflggvtfaeiaalrflsql 

2DGGTB YV lATTKLMNGTS W I EALMEKPF 
SEQ
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion) 


6554 


119 


1244 


FEMGSQVSVBSGALHWIVGGGFGGIAAASQLQALNVPFMLVDM 
KD S FHHNV AALRAS VETG FAKKT F IS YS VTF KDNFRQGL WG I D 
LKNQMVLLQGGBALPFSHLILATGSTGPFPGKFNEVSSQQAAIQ 
AYEDM VRQVQRSRF I VVVGGGSAGVEMAAE I KTEYPEKEVTLIH 
SQVALADKELLPSVRQEVKEILLRKGVQLLLSERVSNLEELPLN 
E YRE Y I KVQTDKGTE VATNLVI LCTG I KINS S AYRKAFE S R LAS 
SGALRVNEIILQVEGHSNVYAIGDCADVRTPKMAYLAGLHANIAV 
ANIVNSVKQRPljQAYKPGALTFLLSMGRNDGVGQISGFYVGRLM 
VRLTKSRDLFVSTSWKTMRQSPP


6555 


1552 


498 


IHMALLRK1NQVLLFLLIVTLCVILYKKVHKGTVPKNDADDESE 
TPEELEEEIPWICAAAGRMGATMAAINSIYSNTDANILFYWG 
LRNTLTRIRKWIEHSKLREINFKIVEFNPMGLKGKIRPDSSRPE 
LLQ PLNFVR F YLPLL IHQHE KVI YLDDD VI VQGDI QELYDTTTiA 
LGHAAAFSDDCDLPSAQDINRLVGLQNTYMGYLDYRKKAIKDLG 
I S P STCS FNPG VI VANMTE WKHQR I TKQLEKWMQKNVE ENLYS S 
SLGGGVATSPMLIVFHGKYSTINPLWHIRHLGWNPDARYSEHFL 
QEAKLLHWNGRHKPWDPPSVHNDLWESWFVPDPAGIFKLNHHS I 


6556 


241 


1449 


ASLCKGCFFVTHVLVI ILPSLQ^PPyFG^FLI^IDGVLVRGHRVI 
PAALKAFRRLVNS 0X3QLR VP VVFVTNAGNILQHS KAQELS ALLG 
CE VDADQVI LS HS PMKLFS E YHEKRMLVSGQGP VMENAQGLGFR 
NWTVDELRMAFPLLDMVDLERRLKTTPLPRNDFPRIEGVLLLG 
EPVRWE TSLQL IMD VLLSNG S PGAGLAT PP Y PHLPVLASNMDLIj 
WMAE AKMPR FGHGTFLLCLET I YQKVTGKELR YEGLMG K PS I LT 
YO YAE DLI RRQAE RRGWAAP I R KL YAVGDNPMS DVYGANL FHQ Y 
LQKATHDGAPELGAGGTRQQQPSASQSCISILVCTGVYNPRNPQ 
^EP^GGGEPPFHGHRDLCFSPGLMEASHVVNDVNEAVQLVFR 




2596 


1534 


RMCGRTSCRLPRDVXiTRACAYQDRRGQQRLPEWRDPDKYCPSYN 
KSPQSNS PVLLSRLHFEKDADSSERI IAPMRWGLVPSWFKESDP 
SKLQFNTTNCRSDTVMEKRS FKVPLGKGRRC WLADGFYKWQRC 
QGTNQRQPYFIYFPQIKTEKSGSIGAADSPENWEKVWDNWRLLT 
MAG I FDCWE P PEGGD VLYS YT 1 1 T VDS CKGLSD IHHRMPAILDG 
EEAVS KWLDFGEVSTQEALKL I HPTENI TFHAVSS WNNSRNNT 
PECLAPVDLWKKELRASGSSQRMLQWLATKSPKKEDSKTPQKB 

ESDVPQWSSQFLQKSPLPTKRGTAGLLEQWLKREKEEEPVAKRP 
YSQ


6558


21 




FHGRRRGGRKMELGSCLEGGREAASEBGEPEVKKRRLLCVEFAS 
VASCDAAVAQCFLAENDWEMERALNS YFE PPVEESAL ERRPET I 
SEPKTYVDLTNEETTDSTTSKISPSEDTQQENGSMFSLITWNID 
GLDLNNLS ERARG VCS YLALYS PDVI FLQEVIPPYYS YLKKRS S 

NYEIITGHEEGYFTAIMLKKSRVKLKSQEIIPFPSTKMMRNLLC 
VHV^IVSGNELCLMTSHI,ESTRGHAAERMNQLKMVI i KKMQEAPES 
ATVIFAGDTNLRDREVTRCGGLPNNIVDVWEFIiGKPKHCQYTWD 
TQMNSNLG I TAACKLR FDR I FFRAAAEEGH 1 1 PRSLDLLGLEKL 
DCGRFPSDHWGLLCNLDIIL


6559
6560


3 


364 


GPEI^GLPTRyKKLKANQTPIAMDCCASRSCSVPTGPATTICSS 
DKSCRCGVCLPSTCPHTVWLLEPTCCDNCPPPCHIPQPCVPTCF 
LLN SCQ P T PGLETLNLTT FTQ P CCEPCLPRGC ! 




3 


1435 



TATSGG I WLRKKWRCHWPRPLPQS CVGTEGGLQ VRDTS SRIAKG I 
3VDHTKMSLHGASGGHERSRDRRRSSDRSRDSSHERTESQLTPC 
IRNVTSPTRQHHVEREKDHSSSRPSSPRPQKASPNGSISSAGNS 
5RNSSQSSSDGSCKTAGEMVFVYENAKEGARNIRTSERVTLIVD 
STRFWDPS I FTAQ PNTMLGRMFG SGREHNFTRPNE KGE YE VAE 
3IGSTVFRAILDYYKTGXIRCPDGISIPELREACDYLCISFEYS 
rl KCRDLSALMHELSNDGARRQFEF YLEEMILPLMVASAQSGER 
:CHIWIiTDDDWDWDEEYPPQMGEEYSQIIYSTKLYRFFKYIE 
SEQ
ID
NO:


Predicted
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








NRDVAKSVLKERGLKKIRLGIEGYPTYKEXVKKRPGGRPBVIYN 
YVQRPFIRMSWEKEEGKSRHVDPQCVKSKSITNLAAAAAD1PQD 
QLWiMHPTPQVDBLDILPIHPPSGNSDLDPDAQNPMIj 


6561 


3 


1086 


PGRRFRRKESSSSRWFPADCLLGLRGPASSLL^PEPSPSWPSHS 
P CPMAALTDLS FM YRW FKNCNLVGNLS E KYVF ITG CD SG FGNLL 
AKQLVDRGMQVLAACFTEEGSQKLQRDTSYRIjQTTLLDVTKSES 
IKAAAQVTVRDKVGEQGLWALVNNAGVGLPSGPNEWIiTKDDFVKV 
INVNIiVGL12VTI,HMbPMVKRARGRVVlIMSSSGGRVAVIGGG v C 
VSKFGVEAFSDSIRRELYYFGVKVCIIEPGNYRTAILGKENLES 
RMRKLWERLPQETRDSYGEDYFRIYTDKIiKNIMQVAKPRVRDVI 
NSMEHAIVSRSPRIRYNPGLDAKLLYIPIAKLPTPVTDFILSRY 


6562 


1 


1562 


MSTLYDIRAHKAQLIJtFFASSDSNKALBQRRTi^HTPKLEHLDRV 
LYEWFLGKRSEGVPVSGPMLIEKAKDFYEQMQLTBPCVFSGGWL 
WRFKARHGIKKI^ASSEKQSADHQAAEQFCAFFRSIjAAEHGLSA 
EQ V YNADETGL FWRCLPNPTPEGGAVPGP KQG KDRIiTVLMCANA 
TGSHRLKPLAIGKCSGPRAFKGlQHIiPVAYKAQGNAWVDKElFS 
DW FHH1 F VPS VREHFRT IGLFEDS KAVIiL L»DS SRAHPQEAELVS 
SNVFTIFLPASVASLVQPMEQGIRRDFMRNFIKPPVPLQGPHAR 
YNMNDAI FS VACAWNAVPSHVFRRAMRKLWPSVAFAEGSSSEEE 
LEAECFPVKPHNKSFAHILELiVKEGSSCPGQLRQRQAASWGVAG 
REAEGGR PPAATS PAEWWSS BKTPKADQDGRGDPGEGEEVAWE 
Uk/wat AJMVLtK^iUs^UPCFSAQF/vGyLRALRAVFRSQQQVRRRR 
GALiGAWKVEAIjQEGPGGCGATAQS PLP CSSTAGDN 


6563


1319


2694 


IiARPAQPVLLREPEGAGPPVPAGHLVHHLQXSGHLRERAHPDEFX" 
HEHPt»P CDQMFWRQMGGHLrRMVE ANS RGWWG I G YDHTAWVYTG 
GYGGGCFQGIiASSTSNIYTQSDVKCVHIYBNQRWNPVTGYTSRG 
LPTDRYMWSDASGLQECTKAGTKPPSLQWAV7VSDWFVDFSVPGG 
TDQEGWQYASDFPASYHGSKTMKDFVRRRCWARKCKLVTSGPWL 
EVP P IALRDVS 1 1 PESPGAEGSGHS I ALWAVS DKGD VLCRLGVS 
*j mw o n Lm, v \? x u\^fr /\t> ibi GACYQ VWAVARDGSAF YRGS V 
Y PSQ PAGD CW YH I PS P PRQRLKQVSAG QTS VYALDENGNL W YRQ 
GITPSYPQGSSWSHVSNNVCRVSVGPLDQVWVIANKVQGSHSriS 
RGTVCHRTGVQPHE PKGHGWD YGIGGGWDH I S VRANATRAPRSS 
SQEQEPSAPPEAHGPVCC 


6564 
6565


1 


975 


APGS(^LWSYCGRGWSRAMRGCQIiGLRSSWPGDLLSARLLSQE 
KRAAETHFG FETVS EEE KGGKVYQ VFES VAKKYDVMNDMMS LG I 
HR VWKD LLL W KMH PLPGTQLLiD VAG G TGD IAFRFLNYVQ S QHQR 
KQKRQLRAO^NLSWEEIAKEYQNEEDSIiGGSRVVVCDINKEMriK 
VG KQ KAIiAOjG YRAGIiAW VIjGDAEEL P FDDDKFD I YTIA FG I RNV 
TH I DQALQEAHR VLKPGGRFUCLE FS QVNNP L I S RLYDLYS FQ V 
I P VLG E VI AGDWKS YQ YLVES I RR FP SQEEF KDMI EDAGFH KVT 
YESLTSGIVAIHSGFKL 


6566 


1464 


999 


RSAVANGLTKRRMGL KLKGR Y I SL I LAVQ I AYI.VQ A VRAAG KCD 
AVFKGFSDCLLKLGDSMA^PC^3LDDKTNIKTVCTYWEDFHSCT 
VTALTDCQBGAKDMWDKLRKESKNLNXQGSLFELCGSGNGAAGS 
LLPA FP VLLVSLS AAIATWLS F 




3 


1385


KYKSAQPGGTQPEPGLGARMAIHKALVMCIX3LPLFLFP<^WAQG~" 
HVPPGCSQGLNPLYYNLCDRSGAWGIVLEAVAGAGIVrTFVLTI 
ILVASLPFVQDTKKRSLLGTQVFFIJ*GTIX3I,FCLVFACVEKPDF 
STCASRRFLFGVLFAICFSCLAAHVFALNFLARKNHGPRGVrVIF 
rVALLliTL VE VI INTEWLI ITLVRGSGEGGPQGNSSAGWAVAS P 
-AIANMDFVMAL I YVMLLIxiGAFJJGAWPALCGR Y1CRWRXHG VFV 
LLTTATS VAI WWWI VM YTYGNKQHNS PTWDDPTLAI ALAANAW 
\F VLF YVI PE VSQVTKS SPEQS YQGDM YPTRGVG YET I LKEQ KG 
JSMFVENKAFSMDEPVAAKRPVSPYSGYNGQLI»TSVYQPTEMAL 
SEQ
ID
NO:


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








MHKVPSEGAYDIILPRATANSQVMGSANSTLRAEDMYSAQSHQA 
ATPPKDGKNSQVFRNPYVWD 


6567


125 


863 


TKRSNLKAYACSIHHIRTMSYVFVKDSSQTNVPLLO^CIDGDFN 
YS KRLLES GFD PN I R DS RGRTGLHIiAAARGNVD I OQLLH KFGAD 
LLATD YQGNTALHLCGHVDTIQFLVSNGLKI D I CNHQGATPLiVIj 
AKRRGVNKDVIRLLESLEEQEVKGFNRGTHSKLETMQTAESESA 
MBSHS LLNPNLQQGEGVLSS ERTTWQE FVEDIiGFWRVLLLI FVI 
ALLS LG I AY YVSGVLP FVENQPELVH 


6568


3 


1183 


HASDRLLVLPDNYSHFSQASANLQGPSRTTELFHPTLAS ISSPM 
LEGAE L YFNVDHGYXiEGLVRGCKASLLTQQD Y INLVQCE TLE DL 
KIHLQTTDYGNFLANHTNPLTVSKIDTEMRKRLCGEFEYFRNHS 
LE PLS TFLTYMTCS YMI DNVILLMNGALQKKS VKEILGKCHPLG 
RFTEMEAVNIAETPSDLFNAILIETPLAPFFQDCMSEKALDELN 
lELLRNKLYKSYIiEAFYKFCKNHGDVTAEVMCPILEFEADRRAF 
I ITLNSFGTBLSKEDRBTLY PTFGKIj YPEGLRLIAQAEDFDQMK 
NVADHYGVYKPLFEAVGGSGGKTLEDVFYEREVQMNVLAFNRQF 
HYGVFYAYVKLKEQEIRNIVWIAECISQRHRTKIN3YIPIL 


6569 


205 


1532 


RRRG P QRLGHGRPT PLL CRWRTAG P SHWBKQARAFQGLR P VD PR 
RMS WL FPLTKSAS S S AAGS PGGLTS LQQQKQRL I ES L.RNS H SS I 
AEIQKDVEYRLPFTINNLTININILLPPQFPQEKPVISVYPPIR 
HHLMDKQGVY VTS PLVNNFTMHSDLGKI 1QSLLDEFWKNP P VLA 
PTSTAFPYLYSNPSGMSPYASQGFPFLPPYPPQEANRSITSLSV 
ADTVS SSTTSHTTAKPAAPS FG VLSNLPLPIPTVDAS I PTSQNG 
FGYKMPDVPDAFPELSELSVSQLTDMNEQEEVLLEQFLTLPQLK 
QI ITDKDDLVKS IEEIARKNLLLEPSLEAKRQTVLDKYELLTQM 
KSTFEKKMQRQHELSES CSAS ALQARliKVAAHBAEEESDNIAED 
FLEGKMErDDFLSSFMEKRTICHCRRAKEEKLQQAIAMHSQFHA 
PL 


6570 


330 


1304 


ARLPRLT FLREG F L YVLLSHWVFVGAP R P PAS DS WKKGL VPS AP 
PASRKMGS KALPAPI PLHPSLQLTNYS FLQAVNTFPATVDHLQG 
LYGLSAVQTMHMNHWTLGYPNVHEITRSTITEMAAAQGI.VDAR? 
PFPALPFTTHLFHPKQGAIAHVLPALHKDRPRFDFANLAVAATQ 
EDPPKMGDLSKLSPGLGSPISGLSKLTPDRKPSRGRLPSKTKKE 
F I CKFCGRHFTKS YNLLIHERTHTDERPYTCDI CHKAFRRQDHI* 
RDHRYIHS KEKP FKCQECGKGFCQSRTIiAVHKTLHMQTSSPTAA 
SS AAKCSGETVI CGGT 


6571


169 


656 


APDMKRKKLQKLTDTLTKNCKHLFRGFDKDNDGCVNVLEWIHGL 
SLFLRGSLEEKMKYCFEVFDLNGDGFISKEBMFHMLKNSLLKQP 
S E EDPDEG I KDLVE I TLKKMDHDHDGKLS FADYELAVREETLLL 
EAFGPCLPDPKSQMEFEAQVFKDPNEFNDM 


6572 






T P ERAQ PGALLGAAG CCVCGGRW W PRSHERG YFS S AKMGSKRRN ' 
LS CS E RHQ K LVDBNYCKKLHVQALKNVNSQ I RNQM VQN3NDNRV 
QRKQFLRLLQNEQFELDMEEAIQKAEENKRLKELQLKQEEKLAM 
ELAKLKHESLKDEKMRQQVRENS I ELRELE KKLKAAYMNKERAA 
Q I AEKDAI KYEQMKRDAEIAKTMMEEHKRI I KEENAAEDKRNKA 
KAQYYLDLEKQLEEQEKKKQEAYEQLLKEKLMIDEIVRKIYEED 
QLEKQQKLBKMNAMRRYIBEFQKEQALWRKKKREEMEEENRKII 
EFANMQQOREEDRMAKVQENEEKRLQLQNALTQKLEEMLRQRED 
LEQVRQELYQEEQAEIYKSKLKEEAEKKLRKQKEMKQDFEEQMA 
LKELVLQAAKEEEENFR KTMLAKFAEDDR I ELMNAQ KQRMKQLE 
HRRAVE KL I EERRQQFIiADKQRELEEWQLQ QRRQGF I NA I IE EE 
RLKLLKEHATNLLG YL PKG VFKKE DDI DLLGEE FRKVYQQRSEI 
CEEK 


6573 


767 


275 


GGGGGESQSFRAQDGTRTPATDCLMYLQGPRKLMTQGGYDMVQK 
LFLD FFRRRLS QRP TAEE LEQRNI LKPRNEQE EQEEKRE I KRRL 
TRKLSQRPTVEELRERKILIRFSDYVEVADAQDYDRRADKPWTR 
SEQ
ID
NO:


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


6574
6575


204 


1159 


LTAADKVSRGECWRVGGRTVCWVSLGSPLGSV ~~ 

bESSVPVSVGVFWACGVSWTGAAGLQDGALSDTMARNAEKAMTA 
IARFRQAOLEEGKVKERRPFIASECTELPKAEKWRRQIIGEISK 
KVAQIQNAGLGE PRI RDLNDE INKLLREKGHWEVRI KELGGPDY 
GKVGPKMLDHEGKEVPGNRGYKYFGAAKDLPGVRELFEKEPLPP 
PRKTRABLMKAIDFEYYGYLDEDDGVIVPLEQEYEKKLRAELVE 
KWICAEREARLARGEKEEEEEEEEEINIYAVTEEESDEEGSQEKG 
GDDSQQ K F I AHVP VPS QQE I EE ALVRRKKMEL LQKYAS E TLQAQ 


6576 


117 


820 


iFAlaASQSGGITEElCMLEPQENGVIDLPDVEHVEDBTFPPFPpP~~ 
AS P ERQ DGEGTEPDEE S GNGAP VPVPP KRTVKRN I P KLDAQRL I 

SERGLPALRHVFDKAKFKGKGHSAEDLKMLIRHMEHWAHRliFPK 
LQ FEDF I DR VE YLG S KKE VQTCLKRI RLDLP I LHEDF VSNNDE V 

AENNEHDVTSTELDPFLTNLSESEMFASELSISLTEEQQQRIER 
NKQLALERRQAKLP 


6577 


1 


1060 


P E PQALVG Q KKGALRL L VARL VLTVS APAE VRRR VLR P VCSWMD | 

RETRALADSHFRGLGVDVPGVGQAPGRVAFVSEPGAFSYADFVR 

G FLL PNLPC VFSSAFTQG WGS RRRWVTPAGR PDFDHLLRT YGDV 

WP VANCGVQE YN S NPKEHMTURD YI T Y WKE Y I QAG YS S PRGCL 

YLKDWHLCRDFPVEDVFTLPVYFSSDWLNEFWDAliDVDDYRFVY 

AGPAGSWSPFHADIFRSFSWSVNVCGRKKWLLFPPGGEEALRDR 

HGNLPyDVTSPALCDTHLHPRNQLAGPPLEITQEAGEMVFVPSG 

WHHQVHNLVMCCFSCPLSGAFLQEDGSTTSPIiSQPELGWNGVAH 
G


6578 


2271 


987 


aDRr^SDDFuiviij^MLEAPYKKEEDEQQPJCEVKKDYPSNTTSS 1 
TSNSGNETSGSSTIGBTSNRSRDRDRYRRRNSRSRSPGRQCRHR 
SRSWDRRHGSESRSRDHRREDRVHYRSPPLATGYRYGHSKSPHF 
R EKS P VRE P VDNLSPEERDART VFCMQIiAAR I RPRDLEDFFS AV 
GKVRD VRI 1 S DRNSR RS KG IAYVE FCEIQS VPLAI GLTGQR LLG 
VPI I VQASQAEKNRIiAAMANNLQKGWGGPMRL YVGSLHFNI TED 
MLRG I FEPFG K I DNI VLMKDS DTGRS KG YGF I TFS DS ECARRAI, 
EQLNGFELAGRPMRVGHVTERLDGGTDITFPDGDQELDLGSAGG 
RFQLMAKLAEGAG IQLPSTAAAAAAAAAAQAAALQLNGAVPIiGA 
LNPAALTALSPALNLASQCLQLSSLFTPQTM


6579


377 


1489 


PSSSATMNRaf ^kratilhmaltgasdpsaeaeangekpfllra I 

IjQIALVVSLYWVTSlSMVFLNKYLLDSPSIiRLDTPIFVTFYQCL 
VTTLLCKGLSALAACCPGAVDFPSLRLDIiRVARSVLPLSWFIG 
MITFNWLCIiKYVGVAFYNVGRS lttvfmvllsylllkqtts FYA 
LLTCGI I IGGFWLGVDQEGAEGTLSWLGTVFGVLASLCVSLNAI 
YTTKVLP AVD G S I WRLTFYNNVNAC I IiFL PLLLL3W3ELQALR DF 
AQLGSAHFWGMMTLGGLFGFAIGYVTGLQIKFTSPLTHNVSGTA 
KACAQTVLAVLYYEETKSFLWWTSNMMVLGGSSAYtWVRGWEMK 
KTPEEPSPKDSEKSAMGV


6580 


2 


711 



RPPRVWYPEJUKELSAAAPRWSHRTAPGIMVFYFTSSSVNSSAYT " 

I ymgkdkyenedli khgwpediwfh vdklssahvylrlhkgeni 

EDIPKEVI^DCAHLVKANSIQGCKMI^^ 

dvgqigfhrqkdvkivwekkvneili^lektkverfpdlaaek 
scrdreernekkaqiqemkkreiceemkickremdelrsysslmkv 

3NMSSNQDGNDSDEFM 




62 


1571


jVAIiKNWKPKUTNI papqspvfgeavsgvywmtkvlgmapvlgp 
ippqeqvgplmvkveekeekgkylpslemfrqrfrqfgyhdtpg 
>realsqlrvlccewlrpeihtkeqilellvleqfltilpqelq 
iwvqehcpesaeeavtlledlereldepghqvstppneqkpvwe 

CrsSSGTAKESPSSMQPQPLETSHKYESWGPLYIQESGEEQEFA 
IDPRKVRDCRLSTQHEESADEQKGSEAEGLKGDI I SVI I ANKPE 
.SLERQCVNLENEKGTKPPLQEAGSKKGRESVPTKPTPGERRYI 
SEQ
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








vouwftrtt t^inooruj imkk ixiiuaKir X VvrTKCGKAFSHSSNXiTIi 
H YRTHL VDR P YD CKOG KA FGQS S DLLKHQRMHTE E AP YQCKDCG 
KAFSGKGS LIRH YR XHTGEKP YQCNE CGKS PS QHAGLS SHQRLH 
TGEKP YKCKECGKAPNHS SNFNKHHR I HTGEKPY WCHHCGKT F C 


6581 


228 


476


RVFIiKDLSSTPMASNNTAS I AQARKLVEQLKMEAN IDRI KVS KA 
AADLMAYCEAHAKEDPIjLTPVPASENPFREKKFFCAIL 


6582 


1428 




CFTTKTHCSPVSVPYIiSPLVLRKEtiESLLENEGDQVIHTSSFlN 
QHPI I FWTL V W Y FRRLDLPS NL PGL I LTS EHCNEG VQL P LSS LS 
QDSKLVYIQLLWDNINLHQEPREPLYVSWRNFNSEKKSSLLSEE 
QQETSTLVETIRQS1QHNNVLKPINLLSQQMKPGMKRQRSLYKE 
ILFLSLVSLGRENIDIEAFDNEYGIAYNSLSSEILERLQKIDAP 
PSASVEWCRKCFGAPLI 


6583 
6584


487 


41 


RIFSMTSGRLRWRCTWRPATALWSASLRU5TSSMHPSPRSISLP 
LSMMLSPLPSNTRGIiSPTALFRSPDSEHATSCPRUILWRCRAPL 
RS PS PIXSRLQVLPRS P LHVHTHNSG KEVLGLQVQRS RS GTGPAC 
SQAGSGAVQGGNWCI F 




189 


1750 


PX,PMAALGFSSQIWTEYVWVPKNTTKKYNIMAFNAADKVNFAT~ 
WNQARLERDLSNKKIYQEEEMPES GAGS EFNRKLRE EAR RKKYG 
I VLKEFRPEDQP WLIjRVNGKSGRKFKGI KKGG VTENTS Y Y I FTQ 
CPDGAFEAFPVHNWYNFTPLARHRTLTAEEAEEEWERRNKVXiNH 
FSIMQQRRLKDQDQDEDEEEKEKRGRRKASELRIHDLEDDLEMS 
SDASDASGEEGGRVPKAKKKAPLAKGGRKKKiOCKGSDDEAFEDS 
DDGDFEGQEVDYMSDGSSSSQEEPESKAKAPQQEE3PKGVDEQS 
DSSEESEEEKPPEEDKEEEEEKKAPTPQEKKRRKDSSEESDSSE 
ESDIDSEASSAFFMAKKKTPPKRERKPSGGSSRGtfSRPGTPSAE 
GGSTSSTXRAAASKLEQGKRVSEMPAAKRLRLDTGPQSDSGKST 
PQ P PSG KTTPN S GDVQVTEDAVRR YIjTRKPMTTKDLIjKK FQTKK 
TGLS S EQT VNVIiAQI L KRLNPERKM I NDKMHFS LKE 


6585 


3 


1678 


GPIRNSRIDDFVGGDPRAEASCS vlhs kphamadsrdpasdqmq 
HWKBQRAAQ KADVF i TTGAGNP VGD KLNVI TVGPRG PLLVQD WF 
TDEMAH FDRE RIPE R WHAKGAGAFG YFEVTHD I TKYS KAKVFE 
HIGKKTPIAVRFSTVAGESGSADTVRDPRGFAVKFYTEDGNWDL 
VGNNTPIFFIRDPILFPSFIHSQKRNPQTHLKDPDMVWDFWSLR 
P ES LHQVS FLFSDRG I PDGHRHMNG YGSHTFKL VNANGEAVYCK 
* "* -*■ ^\£^j ■*■ ivii v . 0 ^"**^iJoyriiyJrJJXvXRDiiFI4AIAXGKYPSW 
TFYIQVMTFKQAETFPFNPFDIjTKVWPHKDYPLIPVGKLVLNRN 
PVNYFAEVEQIAFDPSNMPPGIEASPDKMLQGRLFAYPDTHRHR 
LGPNYLHIPVNCPYRARVANYQRDGPMCMQDNQGGAPNYYPNSF 

gapeqqpsalehs iqysgevrrfntanddnvtqvrafyvnvlne 
eqrkrlceniaghlkdaqifiqkkavknftevhpdygshiqall 

DKYNAEKPKNAIHTFVQSGSHIiAAREKANL 


6586 
6587


32 


804 


PliPEQPAESTSTMPVSGTPAPNKKRKSSKLIMELTGGGQESSGL 
NLGKKISVPRDVMLEELSLLTNRGSKMFKLRQMRVEKFIYENHP 
DVFSDSSMDHFQKFLPTVGGQLGTAGQGFSYSKSNGRGGSQAGG 
S GSAGQ YGS DQQHHLGSG SGAGGTGGPAGQAGRGGAAGTAG VGE 
TGSGDQAGGEGKHIWFKTYISPWERAMGVDPQQKMELGIDLIiA 
YGAKAELP KYKS FNRTAMP YGG YEKASKRMTFQMPKV 




75 


1117 



RRVPSLGKMPECVJDGBHDIETPYGLLHWIRGSPKGNRPAILTY 
HDVGLNHKLCFNTF FNFEDMQE ITKHF WCHVDAPGQQVGASQF 
PQGYQFPS MEQLAAMLPS WQHFGFKYVI GIGVGAGAYVLAKFA 
LIFPDLVEGLVLVNIDPNGKGWIDWAATKLSGLTSTIiPDTVLSH 
LFSQEELVNNTELVQSYRQQIGNWNQANLQLFWNMYNSRRDLD 
rNRPGTVPNAKTLRCPVMLWGDNAPAEDGWECNSKLDPTTTT 
?LKMADSGGLPQVTQ?GKLTEAFKYFI*QGMGYMPSASMTRIiARS 
ITASLTSASSVDGSRPQACTHSESSEGIjGQVNHTMEVSC 
SEQ
ID
NO:



Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 



Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 



Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)



-sor 



6589 



6590 



2177 



6591 



2177 



6592 



6593 



1405 



~65T 



"6S6" 



1861 



1837 



LQLQAQLLBijRTNNYQLSDELRKNGVE LTSLRQKVAYLDKEFSK 
AQKALSKS KKAQE VE VLLS ENEMLQAKLHS QEBD FRLQNSTLMA 
EFSKLCSQMEQLEQSNQQLKEGAAGAGVAQAGP 



RPWGSAMATKSRQEFFQQLtQGCLLPTAQOGLDQlWLLLAlCIA 
CRLLWRLGLPSYLKHASWAGGFFSLYBFFQLHMVWVVLLSLLC 
YLVL FLCRHS SHRGVFLS VTI LI YLLMGEMHMVDTVTWHKMRGA 
QM I VAMKAVS LGFDLDRGE VGTVPSP VE FMG YL YF VGTI VFGP W 
ISFHSYIiQAVQGRPLSCRWLQKVARSLALALLCLVLSTCVGPYL 
F P YF I PLNGDRLLRNKKRKARGTMVRWLRAYE S AVS FH FSN Y F V 
GFLSEATATLAGAGFTEEKDHLEWDIiTVSKPLNVELPRSMVEVV 
TSWNLPKSYWLNNYVFKNALRLGTFSAVLVTYAASALLHGFSFH 
LAAVLLS LAF I TYVEHVLRKRLAR I LS AC VLS KRCP PDCSHQHR 
LGLGVRALNLL FGALAI FHLAYLGSL FDVDVDDTTEEQGY3MAY 
TVHKWSELS WASHWVTFGCWI FYRLIG 



VRAYEHVLS LLENVFTPMFCHRDEYFRQLLRGAES PTRNS KLNR 
GSLSLDDFRNTQKRGESFGISRIGSKIKGVFKSTTMEGAMLPNY 
GVAEGEDDFI EEGI WMEDDS PVEAVSTPNTPRNLAAWKI S I PY 
VDFFEDPS S E RKEKKERI P VFCI D VERNDRJ^AVGHE PEHWS V YR 
RYLEFYVLBSKLTEFHGAFPDAQLPSKR1IGPKNYEFLKSKREE 
FQE YLQKLLQHPELSNSQLLADFLS PNGGETQFLDKI LPDVNLG 
KIIKSVPGKLMKEKGQHLEPFIMNFINSCESPKPKPSRPELTIL 
SPTSENNKKLFNDLFKNNANRAENTERKQNQNYFMEVMTVEGVY 
EYLMYVGRWFQVPDWLHHLLMGTRILFKNTLEMYTDYYLQCKL 
EQLFQEHRLVSLITLLRDAIFCENTEPRSLQDKQKGAKQTFEEM 
MNYIPDUJVKCIGEETKYESIRLLFDGLQQPVLNKQLTYVLLDI 
VIQELFPE LNKVQKE VTS VTSWM 



VRAYEHVLai,i,ENVFTPMFCHRDEY FRQtLRGAESPTRNSKLNR 
GS LS LDDFRNTQKRGES FGI SR I GS Kl KGVFKS TTMEGAMIi PNY 

GVAEGEDDFIEEGIWMEDDSPVEAVSTPNTPRNLAAWKISIPY 
VDFFEDPSSERKEKKBRIPVFCIDVERNDRRAVGHEPBHWSVYR 
RYLEFYVLES KLTE FHGAFPDAQLPS KR I IGPKNYEFLKS KREE 
FQEYLQKLLQHPELSNSQLLADFLSPNGGETQFLDKILPDVNLG 
KIIKSVPGKLMKEKGQHLEPFIMNFINSCESPKPKPSRPELTIL 
SPTS ENNKKLFWDLFKNNANRABNTERKQNQN YFMEVMTVEGVY 
DYLMYVGRWFQVPDWLHHLLMGTRILFKNTLEMYTDYYLQCKL 
EQLFQEHRLVS LI TLLRDAI FCEN TE PRS LQDKQKGAKQTFEEM 
MN YI PDLLVKC IGEETKYES IRLL FDGLQQPVLNKQLTYVLLDI 
VIQE L FP ELNKVQKEVTS VTSWM 



APE FLGST 1 SS GS M I DANL KLLQ BAE QRLKAI VAEKFAI ATKBG 
DLPQVERFFKIFPLLGLHEEGLRKFSEYLCKQVASKAEENLLMV 
LGTDMSDRRAAVI FADTLTLLFEG I ARI VETHQP I VETYYGPGR 
L YTL I KYLQVE CDRQ VEKWDKF I KQRDYHQ QFRHVQNNLMRNS 
TTEKI E PRELDP I LTEVTLMNARS EL YLRFLKKR I S S DFE VGDS 
MAS E E VKQEHQKCLDKLLNNCLLS CTMQELIGLYVTM EE YFMRE 
T VNKAVALDT YEKGQLTSS M VDD VF Y I VKKC IGRALS SS S I DCL 
CAM INLATTEL ESDFRDVLCNKLRMG F PATT FQD IQRGVTS AVN 
IMHSSLQQGKFDTKGlESTDEAKMSFLVTUflNVEVCSENlSTLK 
KTLES DCT KLFS QG I GGEQAQAKFDS CLSDLAAVSNKFRDLLQE 
GLTELNSTAIXPQVQPWI NS FFSVSHNI EEEEFNDYEANDPWVQ 
QFILNLEQQMAE FKASLS P VI YDSLTGLMTSLVAVELEKWLKS 
TFNRLGGLQFDKELRSLIAYLTTVTTWTIRDKFARLSQMATILN 

LERVTEILDYWGPNSGPLTWRLTPAEVRQVLALRIDFRSEDIKR 
LRL 



EAFSAGSRRRGLALQRGVLGGLGGYCPCCCRRRGRLLVLLLLVR" 

RGGEGGGGRGRGDKRRRRQARRQRRRPEPAEARGGKMADVLSVL 

RQYNIQK)G2IVVKGDEVIFGEFSWPKNVKTNYVVWGTGKEGQPR 
SEQ
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion) 








EYYTLDSILFLLNlSn/HLSHPVYVRRAATENIPWRRPDRKDLLG 
YLNGEASTSASIDRSAPLEIGLQRSTQVKJtAADEVLAEAKKPRI 
EDEECVRLDKERLAARLEGHKEGIVQTEQIRSLSEAMSVEKIAA 
I KAK IMAKKRSTI KTDLDDDI TALKQRS FVDAE VD VTRD I VS RE 
RVWRTRTTILQSTGKNFSKNI fai lqs vkareegrapeqrpapn 
aa p vd ptlrtkqp 1 paaynr ydqe rfkgkeeteg fk i dtmgt yh 
gmtlksvtegasarktqtpaaqpvprpvsqarpppnqxkgsrtp 

III IPAATTSLITMLNAKDLLQDLKFVPSDEKKKQGCQRENETL 
I qrrkdqmqpggtai s VTVP YRWDQPLKLMPQDKDR wavfvq 
GPAWQFKGWPWLLPDGSPVDIFAKIKAFHLKYDEVRLDPNVQKW 
D VT VLEL S YHKRHLDRP VFLR VWE TLDR YM V KHKS HLR F 


6594 


1 


1096 


EFPGRRFRGSQASPLCATCGPALLRAPTRAAMTRSLFKGNFWSA 
D I L S T IG YDN I I QHLNNGRKNCKE FE D FL KERAAI E ER YGKDLL 
NL S R KKP CGQS E I NTLKRALE VFKQQVDNVAQCHIQLAQS LRE E 
ARKMEEFREKQKLQRKKTELIMDAIHKQKSLQFKKTMDAKKNYE 
Q KCRDKDEAEQAVS RS ANLVNP KQQEKLFVKLATS KTAVEDSD K 
AYMLHIGTLDKVREEWQSEHIKACEAFEAQECERINFFRNALWL 
H VNQLSQQCVTS DEM YEQVRKSLEMCS IQRD I EYFVNQRKTGQI 
P PAP I MYENF YS SQKNAVPAGKATG PNLARRGPLPI PKS S PDDP 
f NYSLVDDYSLLYQ 


6595 


57 


781 


PLGTMSDSDLGEDEGLLSLAGKRKRRGNLPKESVKIIiRDWIjYLH 
RYNAYPSEQEKLSLSGQTNLSVI^ICNWFINARRRLLPDMLRKD 
GKDPNQFTI SRRGGKASDVALPRGSS PSVLAVS VPAPTNVLSLS 
VCSMPLHSGQGEKPAAPFPRGELESPKPLVTPGSTLTLLTRAEA 
GS PTGGLFNTPPPTPPEQDKEDFS SFQLLVEVALQRAAEMELQK 
QQDPSLPLLHTPIPLVSENPQ 


6596 


2 


1026 


PRLPVRRYHGRRRLQGRSRGHMAEGDAGSDQRQNEEIEAMAAIY 
GEEWCVIDDCAKIFCIRISDDIDDPKWTLCLQVMLPNEYPGTAP 
PIYQLNAPWLKGQERADLSNSLEEIYIQNIGESILYLWVEKIRD 
VL IQKS QMTE PGPDVKKKTE EEDVECEDDL I IiA CQ PES S VKALD 
FD I S E TRTEVE VEEL PP I DHG I P I TDRRSTFQAHLAPWCP KQV 
KMVLS KLYENKKIASATHNI YAYR I YCEDKQTFLQDCEDDGETA 
AGGRLLHLME ILNVKNVMVVVSRW YGGI LLGPDR FKHINNCARN 
I L VE KNYTNS PEES SKALG KNKK VRKDKKRNEH ' 


6597 


2 


1026 


PRLPVRRYHGRRRLC^RSRGHMAEGDAGSDQRQNEEIEAMAAIY 
(jc.cvih.vj. uuk-RKi PLIRX5 uux UD P KWTLCLQVMLPNEYPGTAP 
PIYQLNAPWLKGQERADLSNSLEEIYIQNIGESILYLWVEKIRD 
VLIQKSQMTEPGPDVKKKTEEBDVECEDDLIIACQPESSVKALD 
FDISETRTEVEVEELPPIDHGIPITDRRSTFQAHLAPVVCPKQV 
KMVLS KLYENKKIASATHNI YAYR I YCEDKQTFLQDCEDDGETA 
AGGRLLHLME I LIWKNVMVVVS RW YGG I LLGPDRFKH INN CARN 
ILVEKNYTNSPEESS KALGKNKKVRKDKKRNEH 


6598 


1099 


419 


PRVRWATTMAMSFEWPWQYRFPPFFTLQPNVDTRQKQLAAWCSL 
VLSFCRLHKQSSMTVMEAQESPLFNNVTCLQRKLPVESIQIVLEE 
LRKKGNLEWLDKSKSSFLIMWRRPEEWGKLIYQWVSRSGQNWSV 
FTLYE LTNGEDTEDEEFHGLDEATLLRALQALQQEHKAE 1 1 TVS 

DGPRRQVLLAGTCLPLLLTSHLS RAFKRRQTQCP PKTGSVTPPD 
SKGLQS 


6599


164 


1593 


KMAALTTLFKYIDENQDRYIKKLAKWVAIQSVSAWPEKRGEIRR" 
MMEVAAADVKQLGGSVELVDIGKQKLPDGSEIPLPPILLGRLGS 
D PQKKT VC I YGHLDVQPAAL EDG WDS EP FTLVERDGKLHGRGS T 
DDKGPVAGWINALEAYQKTGQEIPVNVRFCLEGMEESGSEGLDE 
L I FARKDTFFKD VD YVC ISDNYWLGKKKP C I TYGLRG I C YFF I E 
VECSNKDLKSGVYGGS VHEAMTDL I LLMGS LVDKRGN I L I PGIN 
EAVAAVTREEHKLYDDIDFDIEEFAKDVGAQILLHSHKKDILMH 
RWRYPSLSLHGIEGAFSGSGAKTVIPRKWGKFSIRLVPNMTPE 
WO 01/53312



PCT/US00/34263



SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end
nucleotide
location
corresponding
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








WGEQVTSYLTKKFAELRSPNEFKVYMGHGGXPWVSDFSHPHYL 
AGRRAMKTVFGVEPDLTREGGS IPVTLTFQEATGKNVMLLPVGS 
ADDGAHSQNEKIUNRYNYIEGTKMLAAYLYBVSQLKD 


6600 


2 


934 


PGRLFRVAAMESAGLEQLLRELLLPDTKRIRRATEQLQIVLRAP 
AALSALCDLLASAADPQ I RQFAAVLTRRRLNTRWRR LAAEQR E S 
LKSLILTALQRETEHCVSLSLAQLSATIPRKEGLEAWPQLLQLL 
QHSTHSPHSPEREMGLLLLSVWTSRPEAFQPHHRELLRLLNET 
LGEVGSPGLLFYSLRTLTTMAPYLSTEDVPLARMLVPKLIMAMQ 
TLI P IDEAKACEAIiEALDErJCiESEVPVITPYLSEVXjTFCLEVAR 
NVALGNAI R I R I LCCLTFLVKVKS KALLKNR LLAT LAAHP FPHC 
GC 


6601 


529 


1420 


PRAAARAP P PAVLRRDRRAATAPGAGEMTLHGPLAQRYFLNHIE 
KI TTWQDPRKAMNQPLNHMNLH PAVSSTPVPQRSMAVSQPNLVM 
NHQHQQQMA PS TLSQQNHPTQN P P AGLMSM PNALTTQQQQQQKL 
RLQRIQMERERIRMRQEELMRQEAALCRQLPMEAETLAPVQAAV 
NP PTMTPDMRS ITNNSSDPFLNGGP YHSREQSTDSGLGLGCYSV 
PTTPEDFLSNVDEMDTGENAGO/rPMNINPQQTRFPDFLDCLPGT 
NVDLGTLES EDL I PL FNDVES ALNKSE PFLTWL 


6602 


127 


617 


LLDFPALPKFVLAQSPKAGKPSTMTSMTQSLRE VI KAKTKARNF 
ER VLG K I TL VS AAPGKVI CEMKVEEEHTNA I GTLHGGLTATLVD 
NI STt4ALLCTEROAPGVS VDMNIT YMS PAKLGEDI VITAHVLKQ 
GKTLAFTSVDLTNKATGKLIAQGRHTKHLGN 


6603 


79 


660 


PVGPSSLAARTGI^HLPFLHRIASSRGLDMDLLQFtiAFbFVLLL 
SGMGATGTLRTSLDPSLEIYKKMFEVKRREQLIJ\LKNLAQLNDI 
HQQYKILDVMLKGLFKVLEDSRTVLTAADVLPDGPPPQDEXLKD 
AFSHWENTAF FGD WLRFPRI VHYYFDHNSN WNLLIRWG I S FC 
NQTGVFNQGPHSPILSLM 


6604 


3 


eon 

BOO 


TSTAQRQGGERMS FRGGGRGGFNRGGGGGGFNRGGS SNHFRGGG 
GGGGGGNFRGGGRGGFGRGGGRGGFNKGQDQGPPERVVLLGEFL 
HPCEDDIVCKCTTDENKVPYFNAPVYLENKEQIGKVDEIFGQJJR 
DFYFSVKLSENMKASSFKKLQKFYIDPYKLLPLQRFLPRPPGEK 
GPPRGGGRGGRGGGRGGGGRGGGRGGGFRGGRGGGGGGFRGGRG 
GGFRGRGH 


6605 


7 


848 


SGSRRGAMRAAGVGLVDCHCHLSAPDFDRDLDDVLEKAKKANVV ' 
AIiVAVAEHSGEFEKIMQLSERYNGFVLPCLGVHPVQGIiPPEDQR 
SVTLKDliDVALPIIENYKDRLLAIGEVGIiDFSPRFAGTGEQKEE 
QRQVliIRQIQLA2CRLNLPVNVHSRSAGRPTINLLQEQGAEKVIiI> 
HAFDGRPSVAMEGVRAGYFFS I PPS 1 1 RSGQQKLVKQLPLTS I C 
LETDS PALGPEKQVRNEP WNI S ISAE YIAQVKGISVfiE VI E VTT 
QNALKL FPKLRHliLQK 


6606 


2 


1682 


c vaAnrjuw, v/un us ahssas f xytWWLKRLSLLEDIV YRQLNGLS 
KSLGLIEGYGGRGKGGLPATLSPAEEEKAKGPHEKYGYNSYLSE 
K I SLDR S I PDYRPTKCKELKYS KDLFQ I S I I FI FVNEALS V I LR 
5VHSAVNHTPTHLLKEIILVDDNSDEEELKVPLEEYVHKRYPGL 
VKWRNQKREGL I RAR I EG WKVATGQVTGF FDAHVE FTAGWAE P 
VLS R I Q EN R KRVI LPS I DN X KQDNFEVQRYENSAHG YS WELWCM 
YrSPPKDWWDAGDPSLPIRTPAMlGCSFVVNRKFFGEIGLLDPG 
MDVYGGENIELGIKVWLCGGSMEVLPCSRVAHIERKKKPYNSNI 
GFYTKRNALRVAEVWMDDYKS HVY I AWNLPLENPG ID I GDVS E R 
RALRKSLKCKNFQWYLDHVYPEMRRYNNTVAYGELRNNKAKDVC 
LDQG PLENHTAI L Y P CHGWGPQLAR YT KEGFLHLG ALGTTTLL P 
DTRCLVDNSKSRLPQLLDCDKVKSSLYKRWNFIQNGAIMNKGTG 
RCLEVENRGLAGIDLI LRSCTGQRWTIKNS I K 


6607 


137 


98* m 


VPACAGLKKEARSLLASPPRLLNTKLQASCRALFSPPIQSRQTT 
GI S FGGRGG AGPG VPTRTQVFAAMGAVMGTFS SLQTKQRRPS KD 
KIEDELEMTMVCHRPEGLEQLEAQTNFTKRBLQVLYRGFKNECP 



529 



WO 01/53312 



PCT/US00/34263 



SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








SG WNEDT FKQI YAQFFPHGDAST YAHlf LiFNAFDTTQTGS VKFE 
DFVTALS I LLRGTVHEKLRWTFNL YD I NKDGY INQBEMMD I VXA 

SCQEDDNIMRSLQLFQNVM 


6608 


224 


1140 


RPCFSSPTGIiCPRIiSYPMIIiDQHAVIiPPPKQPSPSPPMSVATRS i 
TGTLQLPPQKPFGQEASLPIiAGEEEZiSKGGEQDCALEEIjCKPIiY 
CKIiCNVTLNS AQQAQAHYQG KNHGKKL RNYYAANS CP PPARMSN 
v vfir/irti tr v v v f fUFHjo r JvFUvsKVI LiATENDYCKLCDASFSSP 
AVAQAH YQGKNHAKRLRLAEAQSNS FS ESSELGQRRARKEGNEF 
KMMPNRRNMYTVQNNSGP YFNPRSRQR I PRDLAMC VTPSGQ F YC 
SMCNVGAGEEME FRQHLES KQHKS KVS EQRYRNEMENLG YV | 


6609 


1 


443 


FRLRCRRFRVAGGRIAGAGLRESRVPAPEQRLSALTLLSWSAVT 
PAAEPGNFQLS PAEPRGPLASPVRAAPRAPCPAAEMSELNTKTS 
f a i « yAAQ?y £E KGKAGNV KKAEEEE E I DI DLTAPE TEKAALAI Q 
GKFRRFQKRKKDPSS | 


6610 


319 


881 


GRKS LCNLH I F I RFPLT YPDM YMGMM CT AKKCG I R FQPP A 1 1 LI 
YESE IKGKIRQR I MPVRNFS KFS DCTRAAEQLKNN PRH KS YLEQ 
VSLRQLEKLFSFLRGYLSGQSLAETMEQIQRETTIDPEEDLNKI* 
DDKBLAKRKS IMDELFEKNQKKKDDPNFVYD1 E VEFPQDDQLQS 

\jK3rlLf X XltOrtJUlSr I 


6611 


978 


212 


PGCSGAGSRVW WIiPALRHLAMGSTESSEGRRVS FGVDEEE RVRV 
LQG VR LS ENW1MRMKE P S S P P PAPTS S TFGLiQDGNI»RAPHKES T 
LPRSGSSGGQQPSGMKEGVKRYEQEHAAIQDKLFQVAKREREAA 
TKHSKASLPTGEGSISHBEQKSVRLARELESREAELRRRDTFYK 

QAQ ILHCYRDRPHEVLLCSDLVKAYQRCVSAAHKG | 


6612 


1724 


992 


VSTHASALSRTQGQPQROPRAAASGAGAGTAGGGGSGGAEGSKM 
STEAQRVDDS PSTSGGSSDGDQRESVQQEPEREQ VQPKKKEGKI 
SSKTAAKIjSTSAKRIQKEIiAElTJjDPPPNCSAGPKGDNriYEWRS 
TILGPPGSVYEGGVFFLDITFSPDYPFKPPKVTFRTRIYHCNIN 
SQGVI CLDILKDNWSPALTI S KVLLS I CSLLTDCNPADPLVGS I 
ATQYMTNRAEHDRMARQWTKRYAT \ 


6613 


130 


748 


ELELSSNMPEQSNDYRVAVFGAGGVGKSSLVljRFVKGTFRESYI | 
*• * *Avviou;i\.3iuiiJUi iUl i vao/lUc PAMQRIjSISKGHA 1 
FILVYSITSRQSLEELKPIYEQICEIKGDV3SIPIMLVGNKCDE 
S PSRE VQSSEAEALARTWKCAFMETS AKLNHNVKELFQELLNLE 
KRRTVS LQI DGKKSKQQKRKEKLKGKCVI M | 


6614 


3 


1191 


SSAAEAMRVLVRRCWGPPLAHGAPJIGRPSPQWRAIJVRXjGWEDCR 
DSRVREK?PWRVIiFFGTDQFAREALRALHAARENKEEELIDKI*E 
WTMPSPSPKGLPVKQYAVQSQLPVYEWPDVGSGEYDVGWASF 

GRLLNEALI UCFP YG T T ,WVRP rT .D» UP fSDHDUt uttjt ur- r\r™ w 1 
GVTIMQIRPKRFDVGPILKQETVPVPPKSTAKELBAVLSRLGAN 
MLISVLKNLPESLSNGRQQPMEGATYAPKISAGTSCIKWEEQTS 
EQIFRLYRAIGNI IPLQTLWMANTIKLIjDIjVEVNSSVIxADPKLT 
GQALIPGSVIYHKQSQILLVYCKDGWIGVRSVMLKKSLTATDFY 
NGYLHPWYQKNSQAQPSQCRFQTLRIiPTKKKQKKTVAMQQCIE | 


6615 


832 


35 


GRVGAGASAMSELPGDVRAFIjREHPSLRLQTDARKVRCILTGHE J 
LPCRLPELQVYTRGKKYQRLVRASPAFDYAEFEPHIVPSTKNPH 
Qh FCKLTLRH I NKCPEHVLRHTQGRR YQ RALCKYEECQKQG VE Y 
VPACLVHRRRRRBDQMDGDGPRPREAFWEPTSSDEGGAASDDSM 
TDLYPPELFTRKDLGSTEDGDGTDDFLTDKEDEKAKPPREKATD 
EGRRETTV YRGLVQKRGKKQLGSLKKKFKSHHRKPKS FSSCKQS 
G ] 


6616 


347 


1886 


LLPPCQGARFIaSSPPHASEDNLFLFWNCILCAFPHPSPQPLQYP 
VWPLLLVITQIPAPRHLRNRPFSFSRGGLDSFSGSLSTPSICRS | 






SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 
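
The one-letter legend above can be applied mechanically to any amino acid segment listed in this table. The following Python sketch is illustrative only and is not part of the disclosed methods: the names LEGEND and summarize_segment are hypothetical, and the sketch assumes that spaces and line breaks inside a printed segment carry no meaning.

```python
# One-letter symbols exactly as listed in the table legend.
LEGEND = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid", "E": "Glutamic Acid",
    "F": "Phenylalanine", "G": "Glycine", "H": "Histidine", "I": "Isoleucine",
    "K": "Lysine", "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine", "S": "Serine",
    "T": "Threonine", "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine",
    "X": "Unknown", "*": "Stop Codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}

# Symbols that annotate the segment rather than name a residue.
MARKERS = {"*", "/", "\\"}


def summarize_segment(segment):
    """Hypothetical helper: count legend symbols in one table entry."""
    # Assumption: whitespace inside a printed segment is layout only.
    symbols = "".join(segment.split())
    return {
        "residues": sum(1 for s in symbols if s in LEGEND and s not in MARKERS),
        "stops": symbols.count("*"),
        "deletions": symbols.count("/"),
        "insertions": symbols.count("\\"),
        # Characters outside the legend, e.g. transcription damage.
        "unrecognized": sorted({s for s in symbols if s not in LEGEND}),
    }
```

Under these assumptions, summarize_segment("PTPC* IGTRVAFFLT") reports 14 residues and one stop codon, and any character outside the legend is listed separately rather than silently counted as a residue.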








PAWVKMAPWPPKGLVPAVLWGLSLFXNLPGPIWLQPSPP'PQSSP 
PPQPHPCHTCRGLVDSPNKGLERTIRDNPGGGNTAWEEENLSKY 
KDSETRLVEVLEGVCSKSDFECHRLLELSEELVESWWFHKQQEA 
PDLFQWLCSDSLKLCCPAGTFGPSCLPCPGGTERPCGGYGQCEG 
EGTRGGSGHCDCQAG YGGEACGQCGLG Y FEAERNAS HLVCSACF 
GPCARCSGPSESNCLQCKKGWALHHIiKCVDIDECGTEGANCGAD 
QFCVNTEGSYECRDCAKACIjGCMGAGPGRCKKCSPGYQQVGSKC 
LDVDECETEVCPGENKQCENTEGGYRCICAEGYKQMEGICVKEQ 
I PESAGFFSEMTEDELWLQQMFFGI 1 1 CALATLAAKGDLVFTA 
I F I GAVAAMTGY WLS ERSDRVLEG F XKGR 


6617 
6618 


118 


673 


VWMAWQ VS LLEL EDRLQ CP I CLEVFKESLMLQCGHS YCKGCLVS 
IjS YHLDTKVRCPMC WQAVDGS SSLPNV3LAWVT ETLLRTiPrsriPP p 
KVCVHHRNPLSLFCE KDQEL I CGLCGLLGS HQHHPVTP I STVCS 

RMKEELAALFSEIiKQEQKKVDELIAKLVKNRTRIDGSAPSLCPC 
LGPATFTFL 




548 


136 


DG KVARRAPNS PAFQND I Y PLVS APRATTAES P WS KVLQNTQCR 

NVPKMTS ERS RIP CLS AAAAEGTGKKQQEG RAMATLDRKVP S PE 

AFLGKPWSSWIDAAKLHCSDNVDLEEAGKEGGKSREVMRLNKEA 
WKYGT 


6619 


246 


842 


PASS E VLTAAVMFliLLNCXVAVSQNMG I GKNGDLPRPPLRNEFR 
YFQRMTTTSSVEGKQNLVIMGRKTWFSIPEKNRPLKDR1NLVLS 
RELKEPPQGAHFLARSIiDDALKIiTERPELANKVDMIWIVGGSSV 
YKEAMNHLGHLKLFVTRIMQDFESDTFFSEIDLEKYKLIiPEYPG 
ILSDVQEGKHIKYKFEVCEKDD 


6620 


3 


1879 


NSRVDDFVARARMAAENEASQESALGAYSPVDYMSITSFPRLPE 
DEPAPAAPLRGRKDEDAFLGDPDTDPDSFLKSARLQRLPSSSSE 

MGSQDGSPLRETRKDPF , ^AAAAFr , ^r , Pr"iTV* , T •nrxTnr»nr»r friw*»»n^ 

VTVALVMQIYFGDPQIFQQGAWTDAARCTSLGIEVLSKQGSSV 
DAAVAAALCLGIVAPHSSGLGGGGVMLVHDIRRNESHMDFRES 
APGALREETIiQRSWETKPGLLVGVPGMVKGLHEAHQLYGRLPWS 
OVI1AFAAAVAQDGFNVTHDI1ARALAEQLPPNMSERFRETFLPSG 
RPPLPGSLLHRPDLAEVLDVLGTSGPAAFYAGGNLTLEMVAEAQ 
HAGGVITEEDFSNYSAliVBKPVCGVYRGHliVLSPPPPHTGPALI 
SALN I LEG FNLTSLVS REQAIiH WVAETLKl ALALASRLGD P VYD 
STITESMDDMLSKVEAAYLRGHINDSQAAPAPLLPVYELDGAPT 
AAQ VL I MGPDD F I VAMVSS LNQ P FG SGI* I TP SG I LLNS QMDDFS 
WPWRTANHSAPSLENSVQPGKRPLSFLIiPTVVRPAEGLCGTYLA 

LGANGAARGLSGLTQVRFTPWLAFFSREPSCGLDCRCLSYLWLV 
SIPHAANMG 


6621 


1 


662 


VQG I TS YQQRLQAIiRKE KSRDAARS RRGKENFE FYELAIQjIjPL P 
AAITSQLDKASIIRLTISYLKMRDFANQGDPPWNLRMEGPPPNT 
SVKVIGAQRRRSPS ALA I EVFEAHLGSH iLQSIiDGYVFAIjNQEG 
KFL Y ISETVS £ YLGLS Q VELTGS S VFD YVHPGDHVEMAEQLGMK 
LPPGRGLLSQGTAEDGASSASSSSQSETPEPWCFPPASDQFLL 


6622 
6623 


2 


319 


UKASGAQEETEAGGPERARAMKANMPKRKEPGRSLRIKVISMGN" 
AE VGKS CI I KR YCEKRFV5 KYLATIG ID YGVTKVHVRDRE I KVN 
I F DMAGH P F F YE VRKP F 




1886 


189 


KALFEKVKKFRLHVEEGDILYAMYVRQTVLKVIKFLIIIAYNSA " 
LVSKVQFTVDCNVDIQDMrGYKNFSCNHTMAHIiFSKLSFCYLCF 
VS I YGLTCL YTL YWLF YRS LRE YS FE YVR QETGFDDI PDVKNDF 
AFMLHMIDQYDPLYSKRFAVFLSEVSENKLKQLNIiNNEWTPDKL 
RQKLQTNAHNRLELPLIMIiSGLPDTVFEITELQSLKLE I IKNVM 
I PATIAQLDNLQELSLHQCSVKIHSAALSFLKENLKVLS VKFDD 
MRELPPWMYGLRNLEELYLVGSLSHDISRNVTLESLRDLKStJCI 
DSIKSWSKlPQAVVDVSSHLQKMCIHNDGTKLV>IC^LKKr^N 
t^E^^LVHCDLERIPHAVFSLLSLQELDLKENNLKSIEEIVSFQ | 
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SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








HLRKLTVLKLWHNSITYIPEHIKKLTSLERLSFSHNKIEVLPSH 
LFLCNKIRYLDLSYKDIRFIPPBIGVLQSLQYFSITCNKVESLP 
DELYFCKKLKTLKIGKNSLSVLSPKIGNLLFLSYLDGKGNHFEI 
L PPELGD CRALKRAGL WEDALFETLPS DVREQMKTE 




218 


1786 


ya KKGGGS R I P AVS TH VAPGRS VLRP FAS GALRLRS LVKALGG C 
RGRPSGLAHLSQETSHWRAKRSGRACLGDFPGEILRSFIMKCTA 
REWLRVTTVLFMARAIPAMWPNATLLEKLLEKYMDEDGEWWIA 
KQRGKRAITDNDMQSILDLHNKLRSQVYPTASNMEYMTWDVELE 
RSAESWAESCLWEHGPASI*LPSIGQNJU3AHWGRYRPPTFHVQSW 
YDEVKDFSYPYEHECNPYCPFRCSGPVCTHYTQWWATSNRIGC 
A INLCHNMN I WGQ I WP KAVYLVCNY S PKGNWWGHAP YKHGRPCS 
ACPPSFGGGCRENLCYKEGSDRYYPPREEETNEIERQQSQVHDT 
HVRTRSDDS S RNEVI S AQQMSQ I VS CE VRLRDQ CKGTTCNR YE C 
PAG CLDS KAKV I GS VH YEMQS 3 1 CRAAIH YG 1 1 DNDGG WVD ITR 
QGR KH YFI KSNRNG I QTIGK YQS ANS FTVS KVTVQAVTCETTVE 
QLC PFHKPASHCPR VY CPR KL YAS KS TLCS CNWNS SLF j 


6625 


1124 


543 


PG PRGGGGS LLSTKALGRS RGLGMH PG PS S GGTEGG VP TALR PP 
GPLVPSTSDDNLLKNIELFDKLALRFHGRLLFLKDVLGDEICCW 
SFYGQGRKIAEVCCTSIVYATE1CK0TKVBFPEARIFEETLNILI 
YE T PRGPDPALLEATGGAAGAGGAGRGEDE ENREHRVRR I HVRR 
HITHDERPHGQQIVFKD 


6626 


3 


1498 


SAVEFVYTDRFHblLGISVEFLCSLRSDATMESITACLHALQAL 
LDVPWPRSKIGSDQDSGIELLNVLHRVILTRESPSIQLASLEW 
RQI ICAAQEHVKEKRRSAEVDDGAAEKETLPEFGEGKDTGGLVP 
GKSLVFATLELCVCILVRQLPELNPKLTGSPGVKATKPQILLED 
GSRL VS AAL VI LS EL PA VCS PEGS I S IL P TIL YLTIGVLRETAV 

KLPGGQLSSTVAASLQALKGI LSS PMARAEKSRTAWTDLLRS AL 
TTI LDCWDPVDETHQELDEVSLLTAI TVFI LS TS PEVTT I PCLQ 
KR C I DKFKATLE I KDPWQ I KTYQLLHS I FQ Y PNP AVS Y P Y I YS 
LASCIMEKLQE IDKRKPENTAELEI FQEGIKVLETLVTVAEEHH 
RAQLVACLLPILI SFLLDENSLGSATS IMRNLHDFALQNLMQIG 
PQ YSS VFKSLVAS S PALKARLEAAI KGNQES VKVKI PTS KYTKS 
PGKtfSSIQLKTSFL 


6627 


1 


697 


GIPHLSSRDMTGTPGAVATRDGEAPERSPPCSPSYDLTGKVMLL 
GDTGVGKTCFLIQFKDGAFI^GTFIATVGIDFRNKVVTVDGVRV 
KLQ I WDTAGQE R FRS VTHA Y YRDAQALLLL YD I TNKSS FDN I RA 
WLTEIHEYAQRDWIMLLGNKADMSSERVIRSEDGETLAREYGV 
PFLETSAKTGP4NVELAFLAIAKELKYRAGHQADEPSFQIRDYVE 
SQKKRSSCCSFM 


6628 


1 


1861 



QCAEFGGGSGGGGGSGGGGSGGGRGAGGEENKENERPSAGSKAN ' 

KEFGDSLSLEILQIIKBSQQQHGLRHGDFQRYRGYCSRRQRRLR 

KTLNF KMGNRHKFTGKKVTEELLTDIOI YLIiLVLMDAERAWS YAM 

QLKQEANTEPRKRFHLLSRLRKAVKHAEELERLCESNRVDAKTK 

LE AQAYTAYLSGMLRFEHQE WKAAI EAFNKCKT I YEKLASAFTE 

EQAVLYNGRVEEISPNTR YPAYT\iTr:rjnc:2vTMirT mhmdt nnnnmn 
•* * , ** , w"»o£iii3riMi.Ri , «-^»*"xojjU"'"-iwjsjj| , 7yfifcLRSGGXE 

GLLAE KLEAL I TQTRAKQAATMSE VE WRGRTVP VKIDKVR I PLL 

GLADNEAA1VQAESEETKERLFBSMLSECRDAIQWREELKPDQ 

KQRDYILEGEPGKVSNLQYLHSYLTYIKLSTAIKRNENMAKGLQ 

RALLQQQPEDDSKRSPRPQDLIRLYDIILQNLVELLQLPGLEED 

KAFQKE I GLKTL VFKA YRC F F I AQS YVLVKKWSEALVL YDRVLK 

YANEVNSDAGAFKNSLKDL PDVQE L I TQVRSE KCS LQAAAI LDA 

NDAHQTETSSSQVKDNKPLVERFETFCLDPSLVTKQANLVHFPP 

GFQPIPCKPLFFDLALNHVAFPPLEDiCLEQKTKSGLTGYIKGIF 
3 FRS 


6629 


5653 


4549 


3ATPLGS VGGRTGKMDAATLTYDTLRFAEFEDFPETSE PVW I LG 
^KYS I FT EKDE I LS D VAS RLWFT YRKNF PAI GGTGPTS DTGWG C 
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SEQ 
ID 

NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








MLRCGQM I FAQALVCRHLGRDWRWTQRKRQPDS YFS VIiNAFI DR 
KDSYYSIHQIAQMGVGEGKSIGQWYGPNTVAQVLKKLAVFDTWS 
o .utt vnx wivri i v Vrin.rn kh.1jL.ki o VPCAGATAFPADSDRHCNGF 
PAGAEVTNRPS P WRPLVLLI PLRLGLTDINEAYVETLKHCFMMP 
QSLGVIGGKPNSAHYFlGYVGEELIYXiDPHTTQPAVEPTDGCFI 
PDESFHCQHPPC^MSlAEIiDPSIAWRGGHI.STQAFGAECCLGM 
TRKTFGFLRFFFSMLG 


6630 


2 


423 


LVQCGGIRRRSAWGAMPGRHVSRVRAIiYKRVLQIiHRVI,PPDLKS 
LGDQ YVKDEFR RH KTVGSDE AQRFLQ K WE V Y ATAIiLQQANE NRQ 
NSTGKACFGTFLPEEKLNDFRDEQIGQLQELMQEATKPNRQFSI 
SESMKPKF 


6631 


2 


423 


LVQCX3GIRRRSAWGAMPGRHVSRVRALYKRVLQLHRVLPPDLKS 
LGDQYVKDEFRRHKTVG3DEAQRFIjQEWEVYATALLQQANENRQ 
NSTGKACFGTFLPEEKLNDFRDEQ IGQLQEIiMQEATKPNRQ FS I 
SESMKPKF 


6632 


1273 


588 


wnsrgrtqrgaapuapaaamkawqrVtrasVtvggeqisaigr 

GICVIjLGISLEDTQKELEHMVRKII^LRVFEDESGKHWSKSVMD 
KQYE I LCVSQ FTliQCVLKGNKPDFHLAMPTEQAEG FYNS FLEQL 
RKTYRPELIKDGKFGAYMQVHIQNDGPVTIELESPAPGTATSDP 

KQLSKLEKQQQRKEKTRAKGPSESSKERNTPRKEDRSASSGAEG 
DVSSEREP 


6633 


1145 


617 


ATGRHEGVPTLEGIIQQLVNGIITPATIPSLGPWGVJ^HSNPMDY 

awgangldaiitqllnqfentgpppadkekiqalptvpvteehv 
gsglecpvckddyalgervrqlpcnhlfhdgcivpwleqhdscp 
vcrks ltgqntatnppgltgvsfsss ssssssss psnenatsns 


6634 


1 


1134 


CXSGIPRKGSGPRRRLPMARLRDCIiPRLiMLTLRSLLFWSLVYCYC 
GLCAS I HLLKLLWSLGKGPAQTFRRPAREHP paclsdpslgthc 
YVRIKDSGLRFHYVAAGERGKPLMLIjIjHGFPEFWYSWRYQLREF 
KSEYRWALDLRGYGETDAPIHRQNYKLDCLITDIKDIIiDSLGY 

skcvl i ghdwggmiawlia i cypemvmklivinfphpwvfteyi 
t.rhpaqllkssyyyffqtpwfpefmfsindfkvlkhlftshstg 
igrkgcqlttedleayi yvfsqpgalsg pinhyrni fsclplkh 

HMVTTPT LLLWG ENOAFME VEMAE VTR F YVKN YPRLT ILS EAS H 
WLQQD QPDIVNKL I WTFLKEE TRKKD 


6635 


1420 


470 


EMRAGQQLASMLRWTRAWRLPREGLGPHGPSFARVPVAPSSSSG 
GRGGAEPR PLPLS YRLLDGE AALPA WFIjHGL FGS KTNFNS I AK 
I LAQQTGRRVLTVDARNHGDS PHSPDMS YEIMSQDLQDLL PQLG 
LVPCWVGHSMGGKTAMLLALqRPELVERLIAVDISPVESTGVS 
HFATYVAAMRAIWIADELPRSRARKLADEQLSSVIQDMAVRQHL 
LTNLVEVDGRFVWRVNLDALTQHLDKIIiAFPQRQESYLGPTLFI, 

lggnsqfvhpshhpeimrlfpraqmqtvpnaghwihadrpqdfi 

AAIRGFliV 


6636 


1514 


1801 


SFCMFSHKQDSHFQAVPVQEKKKRLRRAPWRAFAQPQRLKHPAE 
QPI VRQCIiQRPPLCGVLGPVQQQLPPS LGP VLS PHSDPGWCRVD 
DGGDGVF 


6637 


2 


1501 


CSSS PCFHDGTCVLDKAGS YKCACLAG YTGQRCENLIjEAGKSK I 
KASBDSLSVLEERNCSDPGGPVNGYQKITGGPGL1NGRI1AKIGT 
WSFFCNNSYVLSGNEKRTCQQNGEWSGKQPICIKACREPKISD 
LVRRRVXjPMQVQSRETPLHQLYSAAFSKQKLQSAPTKKPAIjPFG 
DLPMGYQHLHTQIiQYECISPFYRRLGSSRRTCDRTGKWSGRAPS 

cipicjgkienitapktqglrwpwqaaiyrrtsgvhdgslhkgaw 
flvcs gaxi vnert vvvaahcvtdlgkvtmi ktadl k vvlgkf yr 
dddrdektiqslqisaiilhpnydpilldadiailklldkaris 
trvqpiciiaasrdlstsfqeshitvagwnvladvrspgfkndtl 
rsgws wdsllceeqhedhg i pvs vtdnmfcas weptapsdi c 
taetgg iaavs fpgras peprwhlmglvs ws ydktcshrlstaf 
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Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








TKVLPFKDWIERNMK 


6638 


1391 


224 


GGIPQAGGKMAAPWWRAALCECRRWRGFSTSAVLGRRTPPLGPM 
PNSD I DLSNLERLE KYRS FDR YRRRAEQEAQAPHW WRTYREY FG 
EKTD PKEK1 D I GLP P PKVS RTQQLLERKQAI QELRANVEEERAA 
RLRTASVPLDAVRAEWBRTCGPYHKQRLAEYYGLYRDLFHGATP 
VPRVPLHVAYAVGEDDLMPVYCGNEVTPTEAAQAPEVTYEAEEG 
SLWTLLLTSLDGHLLEPDAEYLHWLLTNIPGNRVAEGQVTCPYL 
PPFPARGSGIHRLAFLLFKQDQPIDFSEDARPSPCYQLAQRTFR 
TFDFYKKHQETMTPAGLS FFQCRWDDSVTYI FHQLLDMREPVFE 
FVRPPPYHPKQKRFPHRQPLRYLDRYRDSHEPTYGIY 


6639 


2046 


1268 


IC3CPTMDGf;DrJGMt.T IKKRFVSP2\VT*r)PPJ?KPPOPEWPKVPI£PP 

DPEECPEEVYDPRSLYERLQEQKDRKQQEYEEQFKFKNMVRGLD 
EDETNFLDE VS RQQEL I E KQRRE B E L KELK E YRNN LK KVG I SQE 
NKKEVEKKLTVKPIETKNKFSQAKLIiAGAVKHKSSESGNSVKRL 
KPDPEPDDKNQEPSSCKSLGNTSLSGPSIHCPSAAVCIGILPGL 
GA YS GS S DS ES S S DS EGT I NATGK1 VSS I FRTNTF LEA P 


6640 


117 


1043 


VLEP PDVSMAESEDRSLRI VLVGKTGSGKSATANT ILGEEI FDS 
p t zi Tx. o Jv vpvn rr\ v a c p v won p dt .t .wtyt on T . P TVT* VP c t .n t^tt 1 ir c 

ISRCIISSCPGPKAIVLVLLLGRYTEEEQKTVALIKAVFGKSAM 
KHMVILFTRKEELEGQSFHDFIADADVGIiKSIVKECGNRCCAFS 
NS KKTSKAEKESQVQEIiVELIEKMVQ CNEGAYFSDDI YKDTEER 
LKQREEVLRKI YTDQLNEE I KLVEEDKHKSEEKKEKEI KLLKLK 
YDEKI KNIREEAERNIFKD VFNRI WKMLSE I WHRFLS KCKFYSS 


6641 


1 


894 


SAAVGRRSEVRGCAPRPRLRRSARRMDPVPGTDSAPLAGIAWSS 
ASAP P PRGF S A I S CTVEGAPAS FGKS FAQKS G Y FLCLS SLGS LE 
NPQENWAD I Q I WDKS PLPLGFS P VCDPMDS KAS VS KKKRMCV 
KLLPLGATDTAVFDVRLSGKTKTVPGYLRIGDMGGFAIWCKKAK 
APRP VPIO?RGLS RDMOGLS LDAASO P S KOGIiliERTAi? Rl/5<5 P A «? 
TLRRNDS I YEAS S L YG I S AMDGVP FTLHPR FEGKS CS PLAFS AF 
GDLTI KSLAD IEEEYNYGFWEKTAAARLPPSVS 


6642 


22 


1296 


PLEERMMTKMDPNDQAQRD 1 1 FELRR IAFDAESDPSNAPGS GTE 
KRKAMYTKDYKMLGFTNHINPAMDFTQTPPGMIjAIjDI^LYXiAK^ 
HQDTY I RI VLENSSREDKHECPFGRSAI ELTKMLCEILQVGELP 
NEGRND YHPM FFTHDRAFEELFG I C I QLLNKTWKEMRATAEDFN 
KVMQWREQ ITRAbPSKPNSLDQFKS KLiRSLS YS E I LRLRQS ER 
MSQDDFQSPPIVELREKIQPEILELIKQQRLNRLCEGSSFRKIG 
r^RQERFWYCRIAIJNHKVLHYGDLDDNPQGBVTFESLQEKrPV 
ADIKAIVTGKDCPHMKEKSALKQNKEVLELAFS I LYDPDETLNF 
IAPNKYEYCIWIDGLSALLGKDMSSELTKSDLDTLLSMEMKLRL 
LDLENIQIPEAPPPIPKBPSSYDFVYHYG 


6643 


3049 


2265 


S LHAPAEGRTRGR LiAE KP KM LTRK I KL WDI NAH I TCRLCSG YLI 
DATTVTECLHTFCRSCLVKYLEENNTCPTCRIVIHQSHPMYIG 
HDRTMQDI VYKLVPGLQEAEMRKQRE FYHKLGMEVPGDIKGETC 
S AKQHLD S H RNG E TKADDS SN KEAAEEKPEEDND YHRSDEQ VS I 
CLECNSSKiRGLKJiKWIRCSAQATVIiHLKKFIAKKIjNIjSSFNEL 
DILOTEEILGKDHTLKFVVVTRWRFKKAPLLLHYRPKMDLL 


6644 


1489 


290 


FRPLATEPRGSSPVQLVSSTMSVRTLPLLFLNLGGEMLYILDQR 
LRAQNIPGDKARKVLNDIISTMFNRKFMEELFKPQELYSKKALR 
TVYERLAHAS IMKLNQASMDKLYDLMTMAFKYQVIiLCPRPKDVL 
LVTFNHLDTI KGFIRDS PTI LQQVDETLRQLTE I YGGLS AGEFQ 
LIRQTLLI FFQDLH I RVSM FLKDKVQNNNGRFVLP VSGPVP WGT 
EVPGL IRMFNNKGEEVKRI E FKHGGNYVPAPKEGS FEFYGDRVL 
KLGTNMYS WQP VETHVSGS SKNliAS WTQES IAPNPLAKEELNF 
LARLMGGMEIKKPSGPEPGFRLNLFTTDEEEEQAALTRPEELSY 
EVINIQATQDQQRSEELAR I MGEFE ITEQPPXSTSKGDDLLAMM 
DEL 






SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
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Predicted end" 
nucleotide 
location 
corresponding 
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residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 


6645 


6530 


4646 


FVEGLAGYVYKAASEGKVLTI,AALLLNRSESDIRYLI/3YVSQQG 
GQRSTPLI IAARNGHAKWRLLLEHYRVQTQQTGTVRFDGYVID 
GATALWCAAGAGHFEWKDLVSHGANVNHTTVTNSTPLRAACFD 
GRLDIVKYLVENNANISIANKYDNTCLMIAAYKGHTDWRYLLE 
QRADPNAKAHCGATALHFAAEAGHIDIVKELI KWRAAI WNGHG 
MT PL KVAAE S C KAD WELLLSHAD CDRRS R I E ALELLGAS FAND 
RENYDIIKTYHYLYLAMLERFQDGDNILEKEVLPPIHAYGNRTE 
CRNPQELES I RQDRDALHMEGIi I VRERI LGADNIDVSHP 1 1 YRG 
AVYADNME FEQC I KL WLHALHI/RQKGNRNTHKDIiLRFAQVFS QM 
IHLNETVKAPDIECVIiRCS VLB I EQSMNRVKNI SDADVHNAMDN 
YECNTLYTFLYLVCISTKTQCSEEDQCKINKQIYNLIHLDPRTRE 
GFTLLHLAVNSNTPVDDFHTNDVCSFPNALVTKLLLDCGAEVNA 
VDNEGNS ALH 1 1 VQYNRPI SDFLTLHS III SLVEAGAHTDMTNK 

QNKTPLDKSTTGVSEILLKTQMKMSLKCLAARAVRANDINYQDQ 
IPRTLEEFVGFH 


6646 


176 


890 


PSSRMNHLPEDME^ALTGSQSSHASLRNIHSINPTQLMARIESY 
EGREKKGISDVRRTFCLFVTFDLLFVTLLWIIELNVNGGIENTL 
EKEVMQYDYYSSYFDIFLLAVFRFKVJblLAYAVCRLRHWWAIAL 
TTAVTS AFLIiAKVI LS KLFSQGAFGYVLP IIS FI LAW IET WFLD 

FKVLPQEAEEENRLLIVQDASERAALIPGGLSDGQFYSPPESEA 
GS EEAEE KQDS E KPLLE L 


6647 
6648 


176 


890 


PSSRMNHLPEDMENALTGSQSSHASLRNIHSINPTQLMARIESY 
EGR E KKG I SD VRRT FCL FVTFDLL FVTLL W 1 1 ELNVNGGI ENTL 
EKE VMQ YDYYS S YFDI F LLAVFR FK VL I LAYAVCRLRH W W AI AL 
TTAVTSAFLLAKVILSKLFSQGAFGYVLPIISFILAWIETWFLD 
FKVL PQEAEEENRLL I VQDAS ERAALI PGGLSDGQFYS PPE SEA 
GSEEAEEKQDS EKPLLEL 




413 


897 


RWCWNCFTKYFNSPPEDIDHKDSYLITRSIMAEPDYIEDDNPEL 
IRPOKLlNPVKTSRNHQDIiHRELLMNTQKRGIiAPO^rKPBLQKVME 
KRKRDQVI KQKEEEAQKKKSDLE IELLKRQQKLEQLELEKQKLQ 
EEQENAPEFVKVKGNLRRTGQEVAQAQES 


6649 


1357 


832 


W I PRAAGI RHE VKWD VKE I MSQHN I Y VDALLKE FEQFNRRLNE V 
SKRVRIPLPVSNILWEHCIRLANRTIVEGYANVKKCSNEGRALM 
QLDFQQFLMKLEKLTDIRPI PDKEFVETYIKAYYLTENDMERWI 
KEHREYSTKQLTNTLVNVCLGSHINKKARQKLLAAIDDIDRPKR 


6650 
6651 


32 


765 


LVPLVFS LLVQS CKQVYRS I AMKFVPCLLLVTLS CLGTLGQAPR 
QKQGSTGEE FH FQTGGRJD S CTMR P SS LGQGAGE VWLR VD CRNTD 
QTYWCEYRGQPSMCQAFAADPKSYWNQALQELRRLHHACQGAPV 
LR PS VCREAG PQAHMQQ VTSSLKGS P E PNQQPEAGTPSLRPKAT 
VKLTEATQLGKDSMEELGKAKPTTRPTAKPTQPGPRPGGNEEAK 
KKAWEHCWKPFQALCAFLISFFRG 




3425 


1353 



AKELLKVGDFSLCAGPYQNTADTMENLSKEPIiASFVSESFblSA 
CG I AT EHVKI DNS GEGLTAEAG SETLS RDGEVG VNSDMHYE LSG 
DSDLDLLGDCRNPRLDLEDS YTLRGS YTR KKDVPTDGYE S S LNF 
HNNNQEDWGCSS WVPGMETSLPPflHWTaavK'K'i? tt vmrn n vt m -r n 
DLHG I LRTYAN FS ITKELKDTMRTSHGLRRHPS FSANCGLPSS W 
TSTWQVADDLTQNTLDLBYLRFAHKLKQTIKNGDSQHSASSANV 
FPKES PTQIS IGAFPSTKISEAPFLHPAPRSRSPLLVTWESDP 
RPQGQ PRRG YTAS SLDS S S S WRERCS HNRDLRNS QRNHTVS FHL 
NKLKYNSTVKESRNDISLILNEYAEFNKVMKNSNQFIFQDKELN 
DVSGEATAQEMYLPFPGRSAS YEDI I IDVCTNLHVKLRS WKEA 
CKSTFLF YLVETEDKS FFVRTKNLLRKGGHTE I E PQHFCQAFHR 
ENDTLIIIIRNEDISSHLHQIPSLLKLKHFPSVIFAGVDSPGDV 
LDHTYQELFRAGGFVISDDKILEAVTLVQLKEIIKILEKLNGNG 
WKWLLHYRENKKLKEDERVDSTAHKKNIMLKSFQSANIIELLH 
fHQCDS RS STKAE I LKCLLNLQ I QHI DARFAVLLTDKPT I PRE V 
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amino acid 
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Predicted end 
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to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 


6652 






FENNGI L VTD VNNF I ENI EK I AAP FRS S Y W 




2 


1343 


IPGSTISCSCHSRRLRGGSPAPRLSLGAASPRPRPPSLPLPLPL *" 

PFPLFLPTRPAERAWIRSRRASEWVGKMEVPRLDHALNSPTSPC 

EEVIKNLSLEAIQLCDRDGNKSQDSGIAEMEELPVPHNIKISNI 

T CDS F KI S WEMDS KS KDR I TH Y FI DLNKKENKNSNKFKHKDVPT 

KLVAKAVPL PMTVRGHWFLS PRTE YTVAVQTAS KQVDG DYWS E 

WS E 1 1 E FCTAD YSKVHIiTQLLE KAE V I AGRML KFS VF YRNQHBCE 

YFDYVREHHGNAKQPSVKDNSGSHGSPISGKLEGIFFSCSTEFN 

TGKPPQDS P YGRYRFE IAAE KLFNPNTNLYFGDFYCMYTAYH YV 

ILVIAPVGSPGDEFCKQRLPQLNSKDNKFLTCTEEDGVLVYHHA 

QDVILEVIYTDPVDLSLGTVAEITGHQLMSLSTANAKKDPSCKT 

CNISVGR 


6653 



170 


1910 


FFLEPRLRPFPASRARFVPARTRPSPLHPCCFCFEGGGSMLSPQ 
R VAAAASRGADDAME S S KPGPVQVVI>VGKOQHSFEIjDEKAIAS I 
LLQDHIRDLDWWS VAGAFRKGKS FI LDFMLRYLYSQ KESGHS 
NWLGDPEEPLTGFSWRGGSDPETTGIQIWSEVFTVEKPGGKKVA 
WLMDTQGAFDSQSTVKDCATIFALSTMTSSVQIYNLSQNIQED 
DLQQLQLFTEYGRLAMDEIFQKPFQTIjMFLVRDWSFPYEYSYGIi 
QGGMAFLDKRIiQVKEHQHEEIQNVRNHIHSCFSDVTCFLLPHPG 
LQVATS PDFDGKLKD IAGEFKEQLQAliI P YVLNPSKLMEKE ING 
S KVTCRGLLE YFKAYI KI YOX3EDLPHPKSMIiQATAEAYNIiAAAA 
S AKD I YYNNMEE VCGGEKP YLS PD I IiEEKH C E FKQLALDH FKKT 
KKMGGKDFS FR YQQELEEEI KELYENFCKHNGSKNVFSTFRTPA 
VLFTGIVALYIASGLTGFIGLEWAQLFNCMVGLLLIALLTWGY 

IRYSGQYRELGGAIDFGAAYVLBQASSHIGNSTQATVRDAWGR 
PSMDKKAQ 


6654 


1 


705 


RTSLSPSQCSSFNI^^SAGMQILGVVLTLLGWVNGLVSCAJJPM 
WKVTAFIGNS I WAQWWEGLWMSC WQSTGQMQCKVYDSJaLAL, 
PQDLQAARALCVIALLVALFGLLVYIiAGAKCTTCVEEKDSKARL 
VLTSG I VF V I S G VLTL I P VC WTAHAVI RDFYNPLVAEAQ KRELG 

ASLYLGWAASGLLLLGGGLLCCTCPSGGSQGPSHYMARYSTSAP 
AISRGPSEYPTKNYV 


6655 


341 


18 


KDAYMFKKGLIALALVFSLPVFAAEHWIDVRVPEQYQQEHVQGA 
INI PLKE VKERI ATAVPDKNDTVKVYCNAGRQSGQAKE I LS EMG 
YTHVENAGG LKD I AMPKVKG 


6656 


2 


1212 


TELPPRPANIAIQPPLSPLRAIiAPLPEKPGAVPPPQKRMAKVAir" 

DLNPGVKKMSLGQLQSARGVACLGCKGTCSGFEPHSWRKICKSC 

KCSQEDHCLTSDLEDDRKIGRLLMDSKYSTLTARVKGGDGIRIY 

KRNRM I MTNP IATGKDPTFDTITYEWAPPGVTQKLGLQ YMELI P 

KEKQPVTGTEGAFYRHRQLMHQLPIYDQDPSRCRGLLENELKLM 

EEFVKQYKSEAIiGVGEVAliPGQGGLPKEEGKQQEKPEGABTTAA 

TTNGSLSDPSKEVEYVCBLCKGAAPPDSPWYSDRAGYNKQWHP 

TCFVCAKCSEPLVDLIYFWKDGAPWCGRHYCESIiRPRCSGCDEI 

IFAEDYQRVEDLAWHRKHFVCEGCEQLiIiSGRAYIVTKGQLLCPT 
CSKSKRS 


6657 
6658 


830 
35 


2120 

855 


XiLTCQERAGUCLLSASTMliEJvVYWSPKKVADWIiLENAMPEYCEP 
LEHFTGQDLINLTQEDFKKPPLCRVSSDNGQRLLDMIETLKMEH 
HLEAHKNGHANGHLNIGVDIPTPDGSFSIKIKPNGMPNGYRKEM 
IKIPMPELERSQYPMEWGKTFLAFLYALSCFVLTTVMISVVHER 
VPPKEVOPPLPDTFFDHFNR VQWAFS I CEINGMILVGLWLI QWI* 
LLKYKS I ISRRFFCI VGTLYLYRCITMYVTTLPVPGMHFNCS PK 
LFGDWEAQLRRIMKLIAGGGLSITGSHNMCGDYLYSGHTVMLTL 
r^LFIKEYSPRlUiWWYHWICWI^WGIFCILIJ^HYTVDVVV 
\YYIOTRiFWWYHTMANQQVLKEASQMNLLARVWWYRPFQYFEK 

^VQGIVPRSYHWPFPWPWHLSRQVKYSRLVNDT 

iCCAIiGAPGSPYRGLYFSSAAPCTAPRKAKHQSTLEGIaTKRMLM 
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Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








FDPVPVKQEAMDPVSVSYPSNYMESMKPNKYGVIYSTPLPEKFF 

QTPEGLSHGIQMEPVDLTVNKRSSPPSAGNSPSSLKFPSSHRRA 

SPGLSMPSSSPPIKKYSPPSPGVQPFGVPLSMPPVMAAALSRHG 

IRSPGILPVIQPVWQPVPFMYTSHLQQPLMVSLSEEMENSSSS 

MQVPVIESYEKPISQKKIKIEPGIEPQRTDYYPEEMSPPLMNSV 
SPPOALLQE 


6659 
6660 


18 




EPQRGDCETWFQNCSLPKFVCFFCWGFWLWRAHSMSNLHSLPGL 
RGLTS ISRNQLQCTWAMRVINNYQRRWKNQNTFLIiATFANWNV 
CGNPTITCPHNRTLNNCHHSGVQVPLMYCNLTTPSPQNISNCRY 
AQTPANMFYI VACDNRDQRRDPPQYP WPVHLHTI I 


6661 


514 


1707 


CAASLDCRHHLCE PDMKtiVWPSAKLIjQAAAGASARACDS VTSNV 
LPLLLEQFHKHSQSSQRRTILBMLLGFLKLQQKWSYEDKDQRPL 
NGFKDQLCSLVFMALTDPSTQLQLVG IRTLTVLGAQP DLLS YED 
LELAVGHLYRLSFLKEDSQSCRVAALEASGTLAALYPVAFSSHL 
VPKLAE BLR VGESNLTNG DEP TQ CSRHL CCLQAL SAVSTHPS I V 
KETLPLLLQHLWQVNRGNMVAQSSDVIAVCQSLRQMAEKCQQDP 
ESCWYFHQTAIPCLLALAVQASMPEKEPSVLRKVLLEDEVLAAM 
VS VI GTATTHLSPELAAQSVTHI VPLFLDGNVS FLPENS FPSRF 

QPFQDGSSGQRRLIALLMAFVCSLPRNVSEHIWEVLLFNLDKVT 
PG 


6662 


179 


430 


GVHAASGTI.SATWLAE AKMFDSLAKAGKYLGQAAKLM IGMPD YD 
NYVEHMRVNHPDQTPMTYEEFFRERQDARYGGKGGARCC 




185 


423 


RSLPKPAPAQPASIHCARFSGVTPPTAKTAMSDGMTAFNALMYC 
GPKADDGNI FSACAPASSAVKASVSVAQPGQAVIP 


6663 
6664 


3 


1005 


KPVLSSRVDDFVPPLPETSGRRKKLERMYSVDRVSDDIPIRTWF 
PKENLFS FQ TAS TTMQAISNFR KHLRMVGSRR VKAQTFAERRER 
SFSRSWSDPTPMKADTSHDSRDSSDLQSSHCTLDEAFEDLDWDT 
EKGLEAVACDTEGFVPPKVMIilSSKVPKAEYIPTIIRRDDPSII 
PILYDHEHATFEDILEEIERKLNVYHKGAKIWKMLIFCQGGPGH 
LYLLKNKVATFAKVEKEEDMIHFWKRLSRLMSKVNPEPNVIHIM 
GCYILGNPNGEKLFQNLRTLMTPYRVTFBSPLELSAQGKQMIET 
YFDFRLYRLWKSRQHS KLLDFDDVL 


6665 


58 


968 


PRLLRLPRSVWMDSPWDELALAFSRTSMFPFFDIAHYLVSVMA 
VKRQPGAAALAWKNPISSWFTAMLHCFGGGILSCLLLAEPPLKF 
LANHTNI LLAS S I WYITFFCPHDLVSQGYSYLPVQLLASGMKEV 
TRTWKIVGGVTHANSYYKNGWIVMIAIGWARGAGGTIITNFERL 
VKGDWKPEGDEWLKMSYPAKVTLLG3VIFTFQHTQHLAISKHNL 
MFLYTIFIVATKITMMTTQTSTMTFAPFEDTLSWMLFGWQQPFS 
S CE KKS EARS P SNGVG S LAS KP VD VAS DNVKKKHTKKNE 


6666 


171 


1278 


DERRLACRQWTQQRSELYPGFQKRQRFLPKAGEEAAAQGGRHL 
PGRWLGPGCTQKTPCSVHTATGPEPRKLPLLPPDSPNSGYPKEPA 
ALCPGI PSPCRMTHQDLS ITAKL1NGGVAGLVGVTCVFP IDLAK 
TRLQWQHGKAMYKGMIDCLMKTARAEGFFGMYRGAAVNLTLVTP 
EKAIKLAANDFFRRLLMEDGMQRNLKMEMLAGCGAGMCQVVVTC 
PMEMLKIQLQDAGRLAVHHQGSASAPSTSRSYTTGSASTHRRPS 
ATLIAWELLRTQGIiAGL YRGJLGATLLRDIPFS 1 1 YFPLFANLNN 
LGFNELAGKAS FAHS FVSGCVAGS I AAVAVTPLDVLKTR IOTLK 
KGLGEDMYSGITDCAR 




498 


2868 



MTTFLp v ygWMAGFS FGTFGNPPMES PSAWQTIHQPFI VS CLTL 
fiS PG CW PQP I Q KEG VGLWD I R KPQS S LLR YGGNLSLQS AMS VR F 
^SNGTQLLALRRRLPPVLYDIHSRLPVFQFDNQVYFNSCTMKSC 
2F AGDRDQ Y 1 LSGSDDFNL YMWR I PAD P EAG G I GR WNGAFM VL 
CGHRS IVNQVRFNPHTYMICSSGVBKI IKIWSPYKQPGCTGDLD 
3RIEDDSRCLYTHEEYISLVLNSGSGLSHDYANQSVQEDPRMMA 
? FDSLVRREIEGWSSDSDSDLSESTILQLHAGVSERSGYTDSES 
5ASLPRSPPPTVDESADNAFHLGPLRVTTTNTVASTPPTPTCED 
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SSQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








AASRO^RLSAIiRKYQDKRIiALSNESDSEENVCEVELDTDLFPR 
P RS PS PEDES S3SSSSSSS EDE E ELNERRAS TWQRNAMRRRQKT 
TREDKPS AP IKPTNTY I GEDNYDYPQIKVDDLSSSPTSS PERS T 
S TLE I QPSRAS PTSD I ES VERK I YKAYKWLR YS YIS YS NNKDG E 
TS Li VTGEADEGRAGTSHKDNPAPS S S KEACLN I AMAQRNQDL P P 
EGCSKDTPKEETPRTPSKGPGHEHSSHAWAEVPEGTSQDTGNSG 
&vt.tiJb fcTKKLNGKALSSRAEEPPSPPVPKASGSTLNSGSGNCP 
RTQSDDSEERSLETI CANHNNGRLHPRPPHPHNNGQNLGELEVV 
AYSSPGHSDTDRDNSSLTGTLLHKDCCGSEMACETPNAGTRliDP 


6667 


171 


1310 


ABE VERLAAMRS DSLV PGTHTPP I RRRSKPANLGRI PKP WKWRK 
KKSEKFKHTSAALERKISMRQSREELIKRGVLKEIYDKDGELSI 
SNEEDSLENGQSLSSSQLSLPALSEMEPVPMPRDPCSYEVLQPS 
DIMDGPDPGAPVKLPCLPVKLSPPLPPKKVMICMPVGGPDLSLV 
SYTAQKSGQQGVAQHHHTVLPSQIQHQLQYGSHGQHLPSTTGSL 
PMH P3fc»C3<M I DELNKTLAMTMQRLES SEQRVP CSTSYHS SGLHS 
GDGVTKAGPMGLPEIRQVPrWIECDDNKENVPHESDYEDSSCL 
YTREEEEEEBDEDDDSSLYTSSLAMKVCRKDSLAIKPSNRPSKR 
ELEEKNILPRQTDEERLELRQQIGTKL 


6668 
6669 


714 


358 


TLAVATGPALTLRCHVCTS S SNCKHS WCPAS SRFCKTTNTVEP 
LRGNLVKKDCAESCTPSYTLQGQVSSGTSSTQCCQEDLCNEKLH 
NAAP TRTALAHS ALS LGLAL S LLAV I L»APSL 




459 


1207 


KDEETRKDYDY^lLDHPEEYYSHYYHYYSRRLAPKVDVRVVI■l 1 VS 
VCAISVPQFFSWWNSYNKAISYLATVPKYRIQATEIAKQQGLLK 
KAKEKGKNKKS KEE IRDEEENI IKNI I KSKID 1 KGGYQKPQ I CD 
uuitr sj± a LiAPlfHL»CS YI VWYCRWI YNFNIKGKEYGEEERLYI IR 
KSMKMSKSQFDSLEDHQKETFLKRELWIKENYEVYKQEQEEELK 
KKLANDPRWKRYRRWMKNEGPGRLTFVDD 


6670 

> 


184 


594 


VAR I * GEAAKMS SEP PPPYPGGPTAPLLEEKSGAPPTPGRSS PA 
VMQPPPGMPLPPADIGPPPYEPPGHPMPQPGFIPPHMSADGTYM • 
P PG FYP P PGPH P PMG Y YP PG P YTPG P YPG PGGHTATVLVP SGAA 
XTVTV 


6671 


1 


763 


D PAE KP RS APNMAGGRCGPQLTALLAAW I AAVAATAG PE EAAL P 
PEQSRVQPMTASNWTLVMEGEt-IMLKPYAPWCPSCQQrDSEWEAF 
AKNGEII#QISVGKVDVIQEPGLSGRFFVTTLPAFFHAKDGIFRR 
YRGPG I FEDLQNY I LEKKWQS VE PLTGWKS PAS LTM SGMAGL FS 
ISGKIWHLHNYFTVTLGIPAWCSYVFFVIATLVFGLSMDLVIi*V 
ISQCNWDPPYRHVS * /RPSTNLiGVHTAHTSEHLRL 


6672 


304 


1089 


******* * \jc i mi? ov3Mor olio v r iVi-»aiMAi-LPK-»>>L j l la il « A I AfllAriT 
G VI FFLALLLC I ALLS S YS IHLLLTCAG I AG I RAYEQLGQRAFG 
PAGKVVVATVICLHNVGAMSSYLFIIKSELPLVIGTFLYMDPEG 
DWFLKGNLLIIIVSVLIILPLALMKHLGYLGYTSGLSLTCMLFF 
LVSVIYKKFQLGLCYRATMKQQWESEALVGTPQPRDSTAAVKAQ 
MFHS*LTG VLTQ WPIMAFAFVCHPGGAG PS ITELCRAFQAQD 


6673 
6674 


1116 


1963 


LQ I QTHHTHHGAR VTHLG S HQLLANAGTMLCRQQS S S MAPAFS Q 
S VTCGP S PCVRKQES ATKCLHI GACGSDLWARGWEQG+ G* GLNV 
W LC PCVA FHRGARPQAEEGGARWNTS LVS S P W I PPNP * HS S IGAE 
NAVPRP*QG* KVNPSGQERQS \ WVLPLPVPGEPLKLPGltPG*NK 
SFSRV/SGSKGKWILPRQLM*AS+R\TPRFVPGTQWVPITW/PL 
ITWH*SAPTPPLKACPAPRESDPCSSCLSCPCVTQKPRFSDTGW 
FGAGHCHS S CDFTRKGAAGGPG 




1 


440 


LEFDYMCQYDY VEVRDGDNRDGQI I KRVCGNERPAP IQS IGSSL 
HVLFHSDGSKNFDGFHAIYEEITACSSSPCFHDGrCVLDKAGSY 
KCACLAGYTGQRCENLLEERNCSDPG/WPSQWVPENNRGPWAYQ 
PTPC* IGTRVAFFLT
SEQ
ID
NO:


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


6675


277 


1678 


GNWPTERMAPLDNPTIIIiAHIRQSHVTSDDTGMCEMVLIDHDVD 
LEKIHPPSMPGDSGSEIQGSNGETQGYVYAQSVDITSSWDFGIR 
RRSNTAQRI»ERIiRKERQNQI KCKNIQWKERNS KQSAQEIrKSLFE 
KKS LKE KP P 1 5 GKQS I LS VRLEQC PIjQLNNPFNE YS KFDGKGHV 
GTTATKKIDVYLPLHSSQDRLLPMTWTMASARVQDLIGLrcWQ 
YTSEGREP KLNDNVS AYCLH IAEDDGE VDTDF P PLDSNEP I H KF 
GFSTLALVEKYSSPGLTSKESLFVRINAAHGFSLIQVDNTKVTM 
KEILLKAVKRRKGSQKVSGSRADGV3^EDSQIDIATVQDMLSSH 
HYKSFKVSMIHRLRFTTDVQli/GCAIiFPGVLRKRAAPVDCLRPS 
ADTWRQEQIGCCGAACAALRS * DS HKC+ EG IS GDKVE I DPVTNQ 
KASTXFWI KQKPI S IDSDLI*CAC\DI*AEE 


6676 


277 


1678 


GNWPTERMAFLDNPTIILAHIRQSHVTSDDTGMCEMVLIDHDVD 
LBKIHPPSMPGD3GSEIQGSNGBTQGYVYAQSVDITSSWDFGIR 
RRSNTAQRLERLRKERQNQ I KCKN IQWKERNSKQSAQELKSLFE 
KKSLKEKPPISGKQSILSVRLEQCPLQLNNPFNEYSKFDGKGHV 
GTTATKKIDVYLPLHSSQDRIjLPMTVVTMASARVQDIilGlilCWQ 
YT SEGRE P KLNDNVSAYCLHI AEDDGE VDTDF P PLDSNE PI HKF 
GFSTLALVEKYSS PGLTS KESIiFVRINAAHGFSLIQVDNTKVTM 
KEILLKAVKRRKGSQKVSGSRADGVFEEDSQIDIATVQDMLSSH . 
HYKSFKVSMIHRLRFTTDVQL/GCALFPGVLRKRAAPVDCLRPS 
ADTWRQEQIGCCGAACAALRS *DSHKC*EGISGDKVEIDPVTNQ 
KASTKFWI KQKP IS IDSDLI*CAC\DLAEE 


6677 


277 


1678 


GNWPTERMAFLDNPTI I LAH I RQSHVTSDDTGMCEMVL, I DHDVD 
LE KI H P PSMPG DSGSE I QGS NG ETQG Y VYAQS VDI TSS WD FG I R 
RRSNTAQRLERLRKERQNQI KCKNIQWKERNS KQSAQEI»KSLiFE 
KKSLKEKPPI SGKQS ILSVRLEQCPLQLNNPENEYSKFDGKGHV 
GTTATKKIDVYLPLHSSQDRLLPMTWTMASARVQDLIGLICWQ 
YTSEGREPKLNDNVSAYCIiHIAEDDGEVDTDFPPLDSNEPIHKF 
GFSTLALVEKYSSPGLTSKESLFVRINAAHGFSLIQVDNTKVTM 
KE I IjL»KAVKRRKGS qkvsgs RADGVFEEDSQI D i atvqdmls sh 
HYKSFKVSMIHRIiRFTTDVQL/GCALFPGVLRKRAAPVDCIjRPS 
/uj 1 WKU2j,u±^Q.CGAACAAIiRS*DSHKC* EG ISGDKVEIDPVTNQ 
KASTKFWIKQKPISIDSDIjLCAC\DLAEE 


6678 


221 


865 


GPSNQSS3SLSLIVTGCSSYWS*INDTCTILRVLSStfFGRQ+LR "" 
PFPCSQLPMSQGCLWHLDCCCPWVPYIPGQQWRKGRQRMRN*QS 
LLGSDQESVGLEDLCVFVNFLLHVLLGLFP* PHELFLLPVVDLG 
FLF P LLjLjQGG chcl vlpanlvsqapq IGKLSCRLQTHDLEGSRN 
HHPL FIjWGR WD AVKHL ET VQS GLAS LGFVGQHTSHGPP 


6679 


2 


786 


LEFARGAMPFLGQDWRS PGQNWVKTVDGWKRFLDEKSGSFVSD~ 
SS YCNKE VYNKBNLFNSLNYD/SCSQEEKEGHAE *QNQNS \DFH 
QEKWIYVHKGSTKERHGYCTLGEAFNRLDFSTAILDSRRFNYW 
RliIiELIAKSQLTSLSGIAQKNFMNILEKVVLKVLEDQQNITLIR 
ELLQTLYTSLC TL VKRVGKS VLVGNI NM WVYRW ET I LHWQQQLN 
NIQITRVSGQAQPPPGSGSLHRDTGQTRQDFEFTPVTEESGIjF 


6680 


1498 


2951 


PLCTLPLMPSAIiPGWAGERWEKQWPLA/ PGPGTWQTPVGS ISEE 
P\RKNEPDTHCPRGEARPEV*HLPKPHSPGSEGAEIQTSA*ALP 
/NQVS PPQPM * GAEENGDQRGGKEEAGEELHRSS SGLTAAPGF? 
EVHRNLQTFPGLPSRGGGP/GGAGTQGSWAPGEQPP/SPLLPAS 
MQRSQAGLPGWEAGLVESPTHHIPALRPSGTNATGEAFPSTTCS 
SGP \ PAP PGPTGLRPGGGS S SGGHG * * PGLPVGKV\GALGAAQD 
PQS QGRG PTQG TVGTEMLLS GLGS AKACPAARPAVP * LPS DPAS 
TIPKKGTRGFGEGPGVLQERNRWWGRAQGFTSADAAGTAPPGV 
♦LPAPI.SQPPGATEPQVRACGMAPPSPGTSGRLVAWGRHPGPQV 
AQGCPPGAGCWGSQPRGSQRCPRTYTHSPLGHGRAPCPRRCWH* 
WQDP PSS PRTGCLPGI PARQAYSAPRTRS RPG I RTGRAAYGF I R 
FQGGGGG


6681 
6682 




511


INyiYYNQQQRAFHELK\EKLMSAPALGLPDLTKLFTLHVSERE 
KM T VG VLTQ T VG P WS RPGA YLS KQLDGVS KGWP PCPRALAATAL 
LAQEADBLTLRQNLNRKSPHA\WTLINTKGHH*LINARLTRYQ 
TLLCENPHKTIEVSNT/LNPATLLLVTESPVKHNCLEVLDSVYS 

SRPNLRDHP*TSVDWELYVIX5SGFANPCKVTLKKETSPAPVTPR 
S 


6683 


109


1238 


TVLCGAMQVSSLNEVKIYSLSCGKSLPEWLSDRKKRALQKKDVD
VRRRIELIQDFEMPTVCTTIKVSKDGQYILATGTYKPRVRCYDT
YQLSLKFERCLDSEWTFEILSDDYSKIVFLHNDRYIEFHSQSG
FYYKTRIPKFGRDFSYHYPSCDLYFVGASSEVYRLNLEQGRYLN
PLQTDAAENNVCDINSVHGLFATGTIEGRVECWDPRTRNRVGLL
D\AP*TVSQQIQR*TSLPTISALKFN\GALTMAVGTTTGQVLLY
DLRSDKPLLVKDHQYGLPIKSVHFQDSLDLILSADSRIVKMWNK
NSGKIFTSLEPEHDLNDVCLYPNSGMLLTANETPKMGIYYIPVL
GPAPRWCSFLDNLTEELEENPESNE


6684 


109 


1238


TVLCGAMQVSSLNEVKIYSLSCGKSLPEWLSDRXKRALQKKDVD
VRRRIELIQDFEMPTVCTTIKVSKDGQYILATGTYKPRVRCYDT
YQLSLKFERCLDSEWTFEILSDDYSKIVFLHNDRYIEFHSQSG
FYYKTRIPKFGRDFSYHYPSCDLYFVGASSEVYRLNLEQGRYLN
PLQTDAAENNVCDINSVHGLFATGTIEGRVECWDPRTRNRVGLL
D\AP*TVSQQIQR*TSLPTISALKFN\GALTMAVGTTTGQVLLY
DLRSDKPLLVKDHQYGLPIKSVHFQDSLDLILSADSRIVKMWNK
NSGKIFTSLEPEHDLNDVCLYPNSGMLLTANETPKMGIYYIPVL
GPAPRWCSFLDNLTEELEENPESNE




111 


527 


CjlxRGGTSRGkAGREPKFAAGVLCWAGFCQSPCPPGGRGREAPA— 
PP \ SGRRHA* R PA* WLGG PGGDSGGREEGGS /GELQRAMES KMG 

RNIVQNYR QEPHW ^ 


6685
6686


258


1473


KLLGDNFE(iFCNKFELSDSENGSNS*QSPL\FDRLFDPDPQKVL 
QGVIDMKNAVIGNNKQKANLIVLGAVPRLLYI,LQQETSSTELKT 
ECA WLGS LAMGTENNVKSIiLDCH 1 1 PALLQGLLS PDLKF I EAC 
LRCLRTIFTS P VTPEELL YTDATVI PHLMALLS RSR YTQEY I CQ 
IFSHCCKGPDHQTILFNHGAVQNIAHDLTSLSYKVRMQALKCFS 
VLAFENPQVSMTL VNVL VDGELLPQ I FVKMLQRD KP I EMQLTS A 
KCLTYMCRAGAIRTDDNCIVLKTLPCLVRMCSKERIJ^EERVEGA 
ETLAYL I EPD VELQRIAS X TDHL lAMLAD YFKY PSS VS AI TD I K 

RQDHDLKHAHELRQAAFKLYASLGANDEDIRKKVSLGEGRPPVL 
TASRQGVTST 


6687 


310 


927 


DSVTFDDI^VUFTPKEWTLUJPTQRNLYRDVMLENYKNIiATVGY 
QLFKPSLISWLEQEESRTVQRGDFQASEWKVQLKTKEIiALQQDV 
LGEPTSSGIQM I QSHNGQEVSDVKQCGDVSSEHS CLKTHVR TON 
SENTFECYLYGVDFLTLHKKTSTGEQRSVFSHVWKKPSSLNPDV 
VCQKNRCTRKKKAF* IiQIiTLGKSFH* S IHT 


6688 


181 


915 


KAMIiEAPYKKEEDEQQRKEVKKDYPSNTTSSTSNSGNETSGSST 
IGETSNRSRDRDRYRRRNSRSRSPGRQCRHRSRSWDRRHGSESR 
S RDHRRE DRVH YRS P PLATG E P VDNLS PEERDART VFCMQLAAR 

IRPRDLEDFFSAVGKVRDVRIISDRNSRRSKGIAYVEFCEIQSV 
PLAIGLTGQRLLGVPIIVQASQAEKNRLAAMANNLQKGNGGPMR 
LYVGSLHFNITEDMLRGIFEPFGKV 




1025 


1


toVFNYPRV^HKCPDSCWkFKFgPlQLOPYIlXSi'^^EKPPI^F 
SEPGLPR/ SATARMATAAAPPNSS IDLPSDSGMGFI S PAGDSLD 

bPSDGGTGFFSLAGDSSSTRLSSLAFISFSLSSVSVGSSAGTTS 
3TSVGSWAAFTSSSSSSTNRDVAGLDFSTVITSVSGSLVPSRE 
/AVICGSKGAGASGSASCSSRAGKTTEATAASSMPSGTSSFSTC 
CMSELEELFSLFSPAPLLSKLFTSSGSIAICCQDSGPSDTGRLS 
^CQLWliADSDTGKLSDCQEVVTVGDSGGLTCPELSLGRM*MSLL
S S AV I PGY S S SSDSRLNT VPTVDLLCP FQTKS ST 


6689 


640 


1299 


SSSASYATSATSISDTAFSGSLKLKHGLLSALDSSSRTS*STSS 
AEDSTFRICSPSVSDTSSDSSGSKDNVLILFSKVSI*SCFSLSS 
FFSDSISFCFSSSSFCKR*FVSSKVSQNALLSSRLSNGPGGSSK 
Q RNS LTARQLAMSL * ATKF * RNACNPNCLS SXKSAL* L S LNQRF 
GGSAS RKPGNI SFNSQKCSALS YCCNFVI KPREVSVSS ENYPAF 


6690 


1 


442 


GTRGKMAATLGPLGSWQQWRRCLSARDGSRMLLLLLLLGSGQGP 
QQVGAGQTFEYLKREHSLSKPYQGVGTGSSSLWNLMGNAMVMTQ 
YIRLTPDMQSKQGALWNRVPCFLRDWELQVHFKIHGQGKKNL\H 
GDGLAIWYTKDRMQP 


6691 


287 


1401 


LKTETSEEKARRYKDRPSQLNAVFQEQKKMIQAQESITLBDVAV 
DFTWEEWQLLGAAQKDLYRDVMLENYSNLVAVGYQASKPDALFK 
LEQGEQLWTIEDGIHSGACSDIWKVDHVLBRLQSESLVNRRKPC 
HEHDA FEN 1 VHCS KSQ FLLGQNHDI FDLRGKS LKSN LTL VNQS K 
GYE I KNSVEFTGNGDS FLHANHERLHTAI KFPASQKL ISTKSQF 
ISPKHQKTRKLEKHHVCSECX3KAFIKKSWLTDHQVMHTGEKPHR 
CSLCEKAFSRKFMLTEHQRTHTGEKPYECPECGKAFLKKSRLNI 
HQKTHTGEKPYICSECGKGFIQKGNLIVHQRIHTGEKPYICNEC 
/ GKG F I QKTCLIAJIQR F1ITER 


6692 


178 


939 


WIKEGELSLWERFCANIIKAGPMPKHIAFIMDGNRRYAKKCQVE 
RQEGHS QGFNKLAETLR WCLNtiG I LE VTV YAFS I ENFKRS KS EV 
IX3LMDLARQKFSRLMEE KEKIiQKHG VCI R VLGDLHLL PLDLQE L 
IAQAVQATKN YNKCFLNVCFAYTS RHE I SNAVREMAWG VEQG LL 
DPSDISESLLDKCLYTNRSPHPDILIRTSGEVRLSDFLLWQTSH 
SCLVFQPVLWPEYTFWNLFEAILQFQMNHSVLQK 


6693 


178 


939 


WIKEGELSJLWERFCANIIKAGPMPKHIAFIMDGNRRYAKKCQVE " 
RQEGHSQGFNKLAETLRWCLNLG I LE VTVYAFS I ENFKRS KSEV 
IXSLMDLARQKFSRLMEEKEKI^KHGVCIRVLGDbHLLPLDLQEL 
IAQAVQATKN YNKCFLNVCFAYTS RHE I SNAVREMAWG VEQGLL 
DPSDISESLLDKCLYTNRSPHPDILIRTSGEVRLSDFLLWQTSH 
S CLV FQP VLWPE YTF WNLFE AI LQFQMNHS VLQK 


6694 


292 


813 


SLLLHLAPPGAYTPSQPLSSVSTETASSVRRQAAESRQHEIiPVR 
E VHS LGQ I L PQDGLTAEAG PFEAQD P WGS PG IS LPAAH I G FAAA 
LAVGPSGCHTEP\FDEVWPSLFLGDAYAARDKSKLIQLGITHW 
NAAAG KFQ VDTGAKFYRGMS LE Y YG I EADDNPFFDLS VY FL P 




6695 


292 


813 


S LLLHLAPPGAYTPSQ P LS S VSTETAS S VRRQAAESRQHE L P VR 
E VHSLGQI LPQDGLTAEAGPPEAQDPWGS PGI SLPAAHIGFAAA 
LAVGP S G CHTEP \ FDE VW PS LFLGDAYAARDKS KL I QLG ITHW 
NAAAG KFQVDTGAKFYRGMSLEYYG IEADDNPFFDLSVYFLP 




6696 


1 


782 


r r. v Kvj«. v v JrAAPlo & SMBPLl*iiAWSYrRRRKFQLCAD 
L CTQM L EKS P YDQAAW I LKARA LTEMVY I DE I DVDQEG IAEMML 
DENA I AQ VPRPGTSLKLPGTNQTGGP S QAVRP ITQAGRP ITG FL 
RPSTQSGRPGTMEQAIRTPRTAYTARPITSSSGRFVRLGTASML 
i bPix* XCiUjSRLNLTKYSQKPKLAKALIEYI FHHENDVKTALD 
LAALSTEHSQYKDWWWK/DQIEKCYYRVGMYREAEKQIKSS


6697
6698


3 
668 


782 
754 


PPLFLRRLNSRALRPGSRKVMAWPASLSGQDVGSFAYLTIKDR 
I PQILTKVI DTLHRHKSEFFEKHGEEG VEAEKKAIS LLSKLRNE 
LQTDKPFIPLVEKFVDTDIWNQYLEYQQSLLNESDGKSRWFYSP 
WLLV\ECYMYRRIHEAI\IQSPPIDYFDVFKESKEQNFYGSQES 
IIALCTHLQQLIRTIEDLD\ENQLKDEFFKLLQ1SLWGEISVDL 
SL\SGGES SSQNTNVLNSLEDLKPFILLNDMEHLWS LLSNCK 
VGSCACAGSCKCKECKCTS CKKSECRAFP 




6699


325


492 


EGELP/PARRVLPRAMTASAQPRGRRPGVGVGWVTSCKHPRCV 
LLGKRKGSVGAGSFQLPGGHLEFGETWEECAQRETWEEAALHLK 
NVHFASVVNSFIEKENYHYVTILiMKGEVDVTHD^
ES KR 1 1 YNHAF F FQES KWSGG I LQ 


6700 


1098 


1392 


TQCWRSSTPGMRTHFRTQP/RLEOGQGFSQQENGHCMDTNECIQ 
FPFVC PRDKPVCVNTYGS YRCRTNKKCSRGYEPNEDGTACVERT 
LLLGLCNLLGK 


6701 


2 


1485 


AAAG PRTR VRRAAAFEG Q PS PS PGLG PTSDKAAAP RT P KRR R LW 
RQRQ/HPAMLCYVTRPDAVLMEVEVEAKANGEDCLNQVCRRLGI 
IEVDY FGLQFTGSKGESLWLNLRNR I S QQMDGLAPYRLKLRVKF 
FVEPHIiILQEQTRHIFFLHIKEAI*liAGHLLCSPEQAVELSAI»LA 
QTKFG D YNQNTAKYNYEELCAKELS S ATLNS I VAKHKE LEGTSQ 
ASAEYQVLQIVSAMENYGIEWHSVRDSEGQKLLIGVGPEGISIC 
KDD FS P INR IAYP WQMATQS GKNV YLTVTKESGNS I VLLFKM I 
STRAAS GL YRA ITETHAFYRCDTVTSAVMMQ YS RDL KGHLAS L F 
LNENINLGKKYVFDI KRTSKEVYDHARRAIjYNAGWDLVSRNNQ 
S PSHS PLKSSES S MNCSSCEGLSCQQTRVLQEKLRKLKEAMLCM 
VCCE EEINSTFCP CGHTVCCESCAAQLQ VGESAAHFCIiQPHLS l 
LLTGSRSQVLAR 


6702 


397 


1971 


PLAKFLKIiDLVNVLCLPMEDVFLFYRTdFCSMGLGSSCHIiSLPK 
RAE ALL CSRKATWRDLVAVRMAEEQE FTQLCKLPAQPSHPHCV 
NNTYRS AQHSQALLRGLLALRDSG ILFDWLWEGRH IEAHR I L 
LAAS CDYFKGMFAGGLKEMEQEEVLIHGVS YNAMCQI IjHFI yts 
EDELSIjSNVQETLVAACQLQI PEI IHFCCDFLMS WVDEENI I.DV 
YRIiAEBFDLSRLTEQIiDTYILKNFVAFSRTDKYRQIiPLEKVYSIi 
LSSNRLEVSCETE V YEGALL YHYSLEQVQADQI S I.HE PPKI/LET 
VRFPLMEAE VLQRLHDKLDPS PLRDTVASALMYHRNESLQPS LQ 
SPQTEI#RSDFQCWGFGGIHSTPS\MSSATRPKYLNPLiLGEWKH 
FTAS LA PR MSNQG IAVLNNFVYL IGGDNNVQGFRAESRCWR YD P 
RHNRW FQ I QSLQQEHADLS VCWGR Y I YAVAGR D YHNDLNAVER 
YD PATNS WAYYA P LKREV YAHAGATLEGKMYX TCGRKGR I T 


6703 


45


1244 


GVGPRAAAMPLELELCPGRWVGGQHPCFIIAEIGQNHQGDLDVA " 
KRMIRMAKECGADCAKFQKSELEFKFNRKAIjERPYTSKHSWGKT 
YGEHKRHLEFSHDQ YRELQRYAEEVGI FFTASGMDEMAVE FLHE 
LWPFFKVGSGDTNNFPYLEKTAK/TRGWHSVLRDVCGVQLNDE 
TSS WD VLGRVRTS KE KVLMVLVLD YSGRPMVI SSGMQSMDTMKQ 
VYQIVKPLNPNFCFLQCTSAYPLQPEDVNLRVISEYQKLFPDIP 
IG YSGHETGIAIS VAAVALGAKVLERH I TLDKTWKGSDHSASLE 
PGELAELVRSVRLVERALGSPTKQLLPCEMACNEKLGKSVVAKV 
KI PEGT ILTMDMLT VKVGEPKG YPPED IFNLVGKKVLVTVEEDD 
TIMEE 


6704 


82 


1007 


TMNTRNRWNSGLGASPASRPTRDPQDPSGRQGELSPVEDQREG 
LEAAP KGP S RE SWHAGQRRTSAYTLIAPNINRRNB IQR IAEQE 
LAN L E KWKE QNRAKP VH L VPRRLGGSQSETE VRQKQQLQLMQS K 
YKQXLKREESVRIKKEAEEAELQKMKAIQREKSNKLEEKKRLQE 
NLRREAFREHQQYKTAEFL/RQTEHR1ARQKCLSKCCLWPTILN 
MGQKLGLQ\DSLKAEENRKLQKMKDEQHQKSELLELKRCX3QEQE 
kaiu.hu 1 bJlRK VNNAFJjDRLQGKSQPGGLEOSGGCWNMNSGNSW 
GI 


6705 


2 


786 


RLCRNSARVPCGWSASRSLGEGAGPZGPLRGPHPRAGGTGTSFt 
S Y KRXGG I MS riAAF YGGKS I L I TVATGFLGKE LMEKLFRTS PD 
LKVIYILVRPKAGQTLQHRVFQILDSKLFEKVIEVRPNVHEKIR 
AIYADIiNQNDFAISKEDMQELLSCTWIIFHCAATVRFDDTLRHA 
VQLNVTATRQLLLMAS QM P KLEAF IHX STAYSNCNLKH I D EVI Y 
PCPVEPKKIIDSLEW\LDDAIIDEITPKLIRDWPNIYTYTK


6706


130


531 


PTHSS S SHSQEMLGKLNMLRNDGH FCDI T 1 RVQDKI frahkwl 
AACS DF FRTKLVGQAED ENKNVLDLHHVTVTGF I PLLE YAYTAT 
LS 1 NTEN I IDVIiAAAS YMQMFS VASTCSEFMKS S ILWNT PNSQ P 

EK


6707 


2233 


1343 


YWSGIGYELQHFHWRKPHFEKKGPPSTCQBRLYESRSRWPCIS~ 

GMVWGWTAVNGSW*GGQbRCVCVCTSHSSDSTRSSQRASKCHS 

FFILSQ*KT*SSWENVrVFAKYSRIYSYGHSCSKGRGD*DFK*KV 

SQAR*SRFCGLCNPCGHCGLDINIjRGGSSPWTDKHSCVHNNl»LC 

NRRVFSbliCEGPGHCYQGAVCREACAAASPGLDSAAEPHRLCEH 

TD*LPK*GPGYIQHFHCDSNXLCILYNISFNLFSYSF*GVARYA 

C * RCHW YFE WLLYNHCGD I LVACL * RRQL* SSQ 


6708 
6709


115


1729 


i vvjo «*Ko^K&j/i'Vv»JKyijijijTUltUAQAAGSPQGGMALQVELVPT 
GEI IRVVHPHRPCKLALGSDGVRVTMESALTARDRVGVQDFVLL 
ENFTSEAAFIENLRRRFRENL2YTYIGPVLVSVNPYRDT.QIYSR 
QHMER YRGVSFYEBP PHLLAVADTVYRALRTERRDQAVM IS VES 
GAG KTDATKRLLQLYAETC PAPQRGGAVRDRLLQSN P VLEAFGN 
AKTLRNDNSSRFGKYMDVQFDFKGAPVGGKILSYLLEKSRWHQ 
NHGERNFHIFYQLLEGGEEETLRRLGLERNPQSYbYLVKGQCAK 
v o a x«urvoi/WAV v ta\J\u iv 1 DrTfcDE VEDLLS IAASVLiHLGNlH 
FAANEESNAQVTTEKQLKYLiTRLIjSVEGSTIjREAIjTHRKI I AKG 
E E LLS PLNL EQAAYARDALAKAVYS RTFTWL VGK 1 NRS LAS KD V 
ES PS WRS TT VLG LLD I YG FE VFQHWS FEQFCIW YCNE KLQQLF I 
ELTLKSEQEEYEAEGIAWEPVQYFNNKII CDLVEE KFKGI I \S I 
LDE\ECLRPGE 




3 


894 


PPHEHLFPSGERGPFSFLVSRRGLGPGKMGKKGKKEKKGRGAEK 

TA^TTMWV V\rC VTi COV&Wt>r\T n AT Till mr\mT -m . .. ■ 

j. /u\ro«ia a *v, v & Ki<t>KiusE JED LEAL I AHFQTLDAKRTQTVELPCPP 
PSPRIiNASLSVHPEKDELILFGGEYFNGQKTFLYNEIiYVYNIRK 
DTWTKVDI PS PP PRRCAHQAVWPQGGGQLWVFGGEFAS PNGEQ 
^^JjwvjjhliA 1 tvTWEQVKSTGGPSGRSGHRMVAWKRQLILF 
GGFHES TRD Y I Y YND VYAFNLDTFTWS KLSPSGTGPT P RSG CQ\ 
IPSLPRAASSVYGGYSKQRVKKDVDKGTRHSDMP 


6710 


158 


980 


RHKMTNYRVESSSGRAARKMRIiALMGPAFIAAIG Y IDPGNFATN 

IOAGASFGYOLiIjWVWWAMT.MaMT.TriTr OJM/T rrv» t/\tt >nnT 

RDHYPRPWWFYWVQAEIIAMATDLAEFIGAAIGFKLILGVSIUL 
QGAVLTGI ATFL I LMLQRRGQKPLEKVI GGLLLFVAAAY I VELI 
FSQPNLAQIiGKGM VIPSLPTSEAVFLAAG VL \GATIMPHVT / YI 
WH S SliTQHLHGGS RQQR YSATKWDVA I AMTIAGF VN LA I MATAA 
SELNFYGHTGVA 


6711 
6712


3 


347 


VTE CKTMTC KMS QLERN I * TM I NTLHH YS VKLGHPDTL IHG E FK 
ELVRTDLHN I LM XENKNDQAI * H I MEDLDTNAHMQI I FKEL IML 
MAMLTW3YHDNMHDADYGPGQQHRPG 


6713


118 


578 


PHGQKRTRYPQVRAPGQQPQAQLAMALCLKQVFAKDKTFRPRKR 
FEPGTQRFELYKKAQASLKSGLDLRSWRLPPGENIDDWIAVHV 
VD FFNR INL I YGTMAERCS * TS CP VMAGGPRYE YR WQDERQ YRR 
PAKLS APRYMALLMDWI ESLI 




2485 


3


QARGyuSEDGEFEIQAEDDARARKLGPGRPLPTFPTSECTSDVE 
PDTREMVRAQNKKKKKSGGFQSMGLSYPVFKGIMKKGYKVPTPI 
QRKTIPVILDGKDWAMARTGSGKTACFLLPMFERLKTOSAQTG 
ARALILSPTREIALQTLKFTKELGKFTGLKTALILGGDRMEDQF 
AALHENPDII IATPGRLVHVAVEMSLKLQSVEYWFDEADRLFB 
MGFAEQLQE I IARI .PGGHQTVLFSATLPKLLVEFARAGLTEPVL 
IRLDVDTKX^EQLKTSFFLVREDTKAAVLILHLLHNVVRPQDQTV 
VFVATKHHAE YLTELLTTQRVS CAHI YSALDPTARKINLAKFTL 
GKCS TL I VTDLAARGLD I PLLDNV INYS FPAKGKLFLHRVGRVA 
RAGRSGTAYSLVAPDEIPYLLDLHLFLGRSLTLARPLKEPSGVA 
GVDGMLGR VPQS WDE EDSGLOSTLE ASLE LRGLAR VADNAQ QQ 
YVRSRPAPSPESIKRAKEMDLVGLGLHPLFSSRFEEEEIjQRLRL 
VDS I KNYRSRATI FE INASSRDLCSQ VMRAKRQ KDRKAI AR FQ Q 
3QQGRQEQQEGPVGPAPSRJPALQEKQPEKEEEEEAGESVEDIFS 
B WGRKRQRSGPNRGAKRRR EEARQRDQEFYI P YR PKDFDS ERG 






WO 01/53312 



PCT/US00/34263 



LSISGEGGAFEQQAAGAVLDLMGDEAQNLTRGRQQLKWDRKKKR 
FVGQSGQEDKKKIKTESGRYISSSYKRDLYQKWKQKQKID*S*L 
GRRRGI LTRRR PR TEE VGEAR PLAQAG C I PGPHAPRHPLQAESA 
LELKTKQQILKQRRRAQKAALSLQRWWPQAALCPQ 


6714 


169 


1416 


NNCQELLPPPPAPMAHIPSGGAPAAGAAPMGPQYCVCKVELSVS " 
GQNLLDRDVTSKSDPFCVLFTENNGRWIEYDRTETAINNLNPAF > 
SKKFVLDYHFEEVQKLKFALFDQDKSSt^RLDEHDFXGQFSCSLG 
TI VS S KK ITR PLLLLND KPAGKGL IT I AAQELS DNRVI TL S LAG 
RRLDKKDLFG KSDPFLEFYKPGDDGKWMLVHRTEVI KYTLDPVW 
K P FT VPLVS L CDGDMEKP IQ VMCYD YDNDGGHDFI GE FQTS VSQ 
MCEARDSVPLEFECINPKKQRKKKNYKNSGIIILRSCKINRDYS 
FLDYI LGGCQLMFTVG IDFTASNGNPLDPSSLHYINPMGTNEYL 
SAI WAVGQI 1 QD YDSDKMFPALGFGAQIjP PDWKVSHE FAI NFNP 
TNPFCSGVDGIAQAYSACLP


6715 
6716


32 


493 


GPAGAESGSLHCLPATVQALAGAAHSPHGGQPPRRGPLIGSGMP ' 
GKPKHLGVPNGRMVLAVSDGELSSTTGPQGQGEGRGSSLSIHSL 
PSGPSSPFPTEEQPVASWAIiSFERLIiQDPLGLAYFTEFLKKEFS 
AENVTFW KACERFQQ I PAS DT j 




1 


176 


gaggpaprsfgseepraalerdkmsaraaaakstameetaiweqH 

HTVTLHRVSLCCSK 


6717 


115 


896 


LFAMSGFENLNTDFYQTS YS I DDQSQQSYDYGGSGGPYSKQ YAgH 

ydysqqgrfvppdmmqpqqpytgqiyqptqaytpaspqpfygnn 

FBDEPPLLEELGrNFDHIWQKTLTVLHPLKVADGSIMNETDIAG 

p^fclafgatlli^kiqfgyvygisaigclgmfcllnlmsmt 
gvsfgcvasvlgycllpmillssfavifslqgmvgiiltagiig 

WCSFSASKIFISALAMBGQQLLVAYPCALLYGVFALISVF


6718 


290 


599 


kqsstvpgtilpslkwhnsglckfpetggkmttfkegltfkdvai 

VIFTBEELGIiLDPVQRNLYQDVMLENFRNLLSVGHHPFKHDVFL 
LEKEKKLDIMKTATQ j 


6719 


1 


691


ptrpeeqdredgkchkmemnpxsgnLncdpiamsqcssdhgcet 

DLDSDDDKIEKPNNFMKDSASQDNGLSRKISRKRVCSSDSDSSL 
QWKKSSKARTGLLRITRRCAATAANKIKLMSDVEDVSLENVHT 
RSKNGRKKPLHLACTTAKKKLSDCEGSVHCEVPSEQYACEGKPP 
DPDS EGSTKVLS QALNGDSDS EDMLNS EHKHRHTN I H KI DAP S K 
RKSSSVTSSG 


6720 


3 


822 


HEVAEEAGGTVYPQRGTMPGTKRFQHVIETPE'PGKWjEiLTGYEAAH 
VP I TEKSNPLTQDLDKADAEN I VRLLGQCDAE I FQE EGQALS T Y 
QRLYSESILTTP4VQVAGKVQEVLKEPDGGI.VVLSGGGTSGRMAF 
LMSVS FNQLMKGLGQKPL YT YL I AGGDRS WAS REGTEDS ALHG 
I EEL KKVAAGKKRVI VIG I S VGLSAP FVAGQMDCCMNNTAVFL P 

VLVGFNPVSMARHPFPPPRILRSLTVFPSLRAPHYQITSLLFSM 
SWTLISE * ) 


6721 


3


822 


HEVAEEAGGTVYPQRGTMPGTKRyQHVIETPEP^KWELTGYEAA ' 
VPITEKSNPLTQDLDKADAENTVRLLGQCDAEIFQEEGQALSTY 
QRLYSESILTTMVQVAGKVQEVLKBPDGGLVVLSGGGTSGRMAF 
LMSVS FNQLMKGLGQKPLYTYLIAGGDRSWASREGTEDSALHG 
IEELKKVAAGKKRVI VIGI S VGLSAP FVAGQMDCCMNNTAVFL P 

VLVGFNPVSMARHPFPPPRILRSLTVFPSLRAPHYOITSLLFSM 
SWTLISE J 


6722 
6723


1 


390 


i^wskrtwqalpmavlflllflcgtpqaadnmqaiyvalgeawH 

LPCP SPSTLHGDEHLSWFCS PAAGSFTTLVAQ VQVGRPAPDPGK 
PGRE S RLRLLGNYS L WLEGS KEEDAGRY W CAVLG QHHNYQNW | 




173 


659


VCQYCTARMADFGISAGQFVAWWDKSSPVEALKGLVDKLQALT ~j 
3NEGRVS VENI KQLLQSAHKESS FDI I LSGLVPGSTTLHSAE IL
\E I AR I LR PGGCLFLKE PVETAVDNNS KVKTAS KLCS ALTLS GL j


6724 
6725 


173 




VEVKELQREPLTPEEVQSVREHLGHESDNL 

VCQYCTARMADFGISAGQFVAWWDKSSPVEALKGLVDKLQALT
GNEGRVSVENIKQLLQSAHKESSFDIILSGLVPGSTTLHSAEIL
AEIARILRPGGCLFLKEPVETAVDNNSKVKTASKLCSALTLSGL
VEVKELQREPLTPEEVQSVREHLGHESDNL 


6726 


356 


722 


RRRTP P V J. LATMDDDLMLALRLQE E WNLQEAERDHAQES LSLVD 
ASWELVDPTPDLQALFVQFNDQFFWGQLEAVEVKWSVRMTLCAG 
I CS YEGKGGMCS IRLS E PLLKLR PRKDL VE VFFV 


6727 


98


714 


H L Q KMERK INRR EKE KE YEGKHNS 1»E DTDQG KNCKSTLMT LNVG 
GYLYITQKQTLTKYPDTFLEGIVNGKILCPFDADGHYFIDRDGL 
LFRHVLNFLRNGELLLPEGFRENQLLAQEAEFFX2LKGLAEEVKS 
RWEKEQLTPRETTFLE ITDNHDRSQGLRIFCNAPDFIS KI KSRI 
VLVSKSRLDGFPEEFS ISSNI IQFKYFIK 


6728 


1 


831 


FKGMGDERPH Y YGKHGT PQKYD PTFKGP I YNRC?CTDI I CCVFLL 
LAIVGYVAVGIIAWTHGDPRKVIYPTDSRGEFCX3QKGTKNENKP 
YLFYFNIVKCASPLVLLEFQCPTPQICVEKCPDRYLTYLNARSS 
RDFE YYKQFCVPGFKNNKGVAEVLRDGDCPAVLI PSKPLARRCF 
PA IHA YKG VLMVGNETTYEDGHGS RKN I TDL VEGAKKANG VLEA 

RQIAf^IFEDYTVSWYWDIISLGlAMAMSLLFIILLRFLAGIMG 
RGMIIMGILVLGY 


6729 


486 


935 


FCS S WLRS li ADS S LS W KM FL VGLTGG I ASGKfJ S V tQVFQQLGGA 

VI DVD VMARHVVQ PGYPAHRRI VEVFGTEVLLENGDINRKVLGD 

LIFNQPDRRQLLNAITHPEIRKEMMKETFKYFLREPRTSPRGKK 
H VPS ALKEADS LMRRDT 


6730 


259 


1191 


Vt» JjTGAQSGktaSMGRDQRAVAGPALRR WLLLG TVTVGFLAQS V 

LAGVKKFDVPCGGRDCSGG CQCY PBKGGRGQ PG PVG PQG YNG P P 

GLQG F PGLQGRKGDKGERGAPGVTGP KGDVGARG VSG FPGADG I 

PGHPGQGGPRGRPGYDGCNGTQGDSGPQGPPGSEGFTGPPGPQG 

PKGQKGEPYALPKEERDRYRGEPGEPGLVGFQGPPGRPGHVGQM 

GP VGAPGRPG P PGPPG PKGQQGNRGLG FYG VKG E KGDVGQ PG PN 

GIPSDTLHPIIAPTGVTFHPDQYKGEKGSEGEPGIRGISLKGEE 
GIM 


6731 


784 


1015


NMVDYYEVLGLQRYASPEDIKKAYHKVALKWHPDKNPENKEEAE 
RKFKEVAEA YBVLSNDEKRD I YDKYGTEGLNEF 


6732 


1 


446 


yiRKRLHGAVVPRVEVGCPWETRESEGVHLERPTSPLKNNDEGS 

LDIYAGLDSAVSDSASKSCVPSRNCLDLYEEILTEEGTAKEATY 

NDLQVEYGKCQLQMKELMKKFKEIQTQNFSLINENQSLKKNISA 
LI KTAR VE INRKDE E I


6733


102 


1205 


GRWQRRPPPPSPPLWCLQPGGGSDPQQLTQLRHCLSHSPQDTPW 

AQRQ VC YTAAT TQAAAPATRNCLP DHSGHRPTP PRS HRHHRQEN 

LGS IKPSSRSTKATSTTMAGDGRRAEAVRBGWGVYVTPRAPIRE 

GRGRLAPQWGGSSDAPAYRTPPSRQGRREVRFSDEPPSVYGDFE 

PLVAKERSPVGKRTRLEEFRSDSAKEEVRESAYYLRSRQRRQPR 

PQETEEMKTRRTTRLQQQHSEQPPLQPSPVMTRRGLRDSHSSEE 

DEASSQTDLSQTISKKTVRSIQEAPAVSEDLVIRLRRPPLRYPR 

YEATSVQQKVNFSEEGETEBDDQDSSHSSVTTVKARSRDSDESG 
DKTrRSSSQYIESFW 


6734 


613 


1311


tea LKg vum K SRNQUG ESAS DGH ISC PKPS 1 1 GNAGE KS LS E DAK 
lOCKKSNRKEDDVMASGTVKRHLKTSGECERKTKKSLELSKEDLl 
2 LLS I MEGELQ ARED VIH MLKTE KTKPEVLEAHYGS AE PE KVLR 
7LHRDAI LAQEKS IGEDVYEKP X SELDRLE EKQKET YRRMLEQL 
LiLAEKCHRR TVYE LEW E KHKHTD YMWKSDD FTNLLEQ ERE RL K K 
jLEQEKAYQARKE 




189 


551


>AAMfc PVFSGUFqELQEKNKSLELVSFBBVAVHFTWE'ErfQDLDD 
^QRTLYRDVMLETYSSLVSLGHCITKPEMIFKLEQGAEPWIVBE


6735 


280 


5S8 


TLNLRLSGGSKKQVFSGICHRSLVSLQEVHLV " 

KSRRAGViKMSNPFIiKQVFNKDKTKRPKRKFEPGTQRPELHKKA 
QAS LNAGLDLRLAVQL P PGEDLNDWVAVHWDFFNRVNL I YGT I 
XDGCT 


6736 


195 


808 


MNYELNFKKEMPNIKSLGLTNLNFLLKRLSSVLPLITDYVYFEN 
SSSNPYLIRRIEELNKTASGNVEAKVVCFYRRRDISNTLIMLAD 
KHAKEIEBESETTVEADLTDKQKHQIiKHREl,FLSRQYESL»PATH 
IRGKCSVALIiNETESVLSYLDKEDTFFYSLVYDPSLKTLLADKG 
E IRVGPR YQAD I PEMLLEGTFFCVFAVL 


6737 


150 


1209 


PVIMPLHFs PGDI VRPSCCVSSS PKLRRNAHSRLES YRPDTDLS 
REDTGCNLQHISDRENIDDIiNMEFNPSDHPRASTI FLS KSQTDV 
REKRXSLFlNHHPPGQIARKYSSCSXIFIiDDSTVSQPNLKYTIK 
CVALAI YYH I KNRD PDGRMLLD I FDENLHPLS KSE VP PDYDKHN 
PEQKQIYRFVRTLFSAAQLTAECAIVTLVYIiERLLTYAEIDICP 
ANWKR I VLGAI LLASKVWDDQAVWNVDYCQI LKDITVEDMNELE 
RQFLELIiQFNINVPSSVYAKYYFDLRSLAEANNLSFPLEPLSRE 

rahkxeaisri^cedkykdlrrsarkrsasadnltlprwspaiis 


6738 


148 


653 


CACAEQPakaeVGAATALPVRWASGEMAPSGSIjAVPIiAVLVIjLIi 
WGAPWTHGRRSNVRV1 TDENWRELLEGDWMI EFYAPWCPACQNL 
QPEWESFAEWGEDLEVNIAKVDVTEQPGLSGRFIITALPTIYHC 

kdgefrryqgprtkkdfinfisdkewksiepvsswf 


6739 


3 


631 


SWPDMAEEEVAKiEKHLMIaliRQBYVKI^KKLAETklkftCALLAAQ 
ANKESSSESFISRLIAIVADLYEQEQYSDLKIKVGDRHISAHKF 
VLAARSDSWSIJuVLSSTKELDLSDANrpEVTMTMI.RWIYTDELEF 
REDD VFLTE LM KLANR FQLQLLRERCE KGVMS L VNVRNC I R F YQ 
TAB ELNAS TLMNYCAE I IAS HWVS EVEG VNKAL 


6740 


3 


631 


S W PD MAEEB VAKLE KHLMLLRQE YVKXtQKKLAETE KRCALLAAQ 
ANKESSSESFlSRIitAIVADLYEQEQYSDLKIKVGDRHISAHKF 
VLAARS DS WSLANLS STKE LD LSDAN PE VTMTMLRWI YTDELEF 
REDDVFLTE LMKLANR FQLQLLRERCEKGVMS LVNVRNC I RF YQ 
TAEE LNAS TLMNYCAE 1 1 ASH WVSEVEG VNKAL 


6741
6742 


141 


960 


PLTL P fssraraghtmnts pgtvgsdp vtlatagydhtvrf wqa 

HSG I CTRTVQHQDS QVNALE VTPDRSM I AAAVQP VS LG YQH I RM 

ydlnsnnpnpi isydgvnkniasvgfhedgrwmytggedctari 
wdlrsrnlqcqrifqvnapincvclhpnqaelivgdqsgaihiw 
dlktdhneqlipepevsitsahidpdasymaavnstlvpfscll 

PIiAIG 1 LQEGEFESliARRGLLFrjACQGNCYVWNI*XGGIGDEVTQ 
LIPKTKIP 


6743 


141 


960 


PLTLPFSSRARAGHTMNTSPGTVGSDPVILATAGYDHTVRFWQA 
HS G I CTRWQHQDS QVNALE VTPDRSM I AAAVQP VSLG YQH I RM 

ydlnsnnpnpiisydgvnkwiasvgfhedgrwmytggedctari 

WDLRSRNLQCQR 1 FQVNAP INCV CLH PNQAEL 1 VGDQS G A 1 H 1 W 

dlktdhneqlipepevsitsahidpdasymaavnstlvpfscll 
plaigilqegefeslarrgllflacqgncyvwnltggigdevto 

LIPKTKIP ^vnwuiwujujiviy 


6744 


1 


412 


MHSTQDKSLhmGDPNPSAAPTSTCAPRKMPKRISlSKQLASVK 

alrkcsdlekaiattalifrnssdsdgklekaiakdllqtqfrn 
faegqetkpkyreilseldehtenkldfedfmilllsitvmsdl 

LQNIR ' 




95 


1343


RTPARNR CAGce VliS R FSSPNKAS S FALQS AGGGL PA VRALR RD 

rqkvstvgygmdeveqdqhbarlkelfdsfdttgtgslgqeelt 
dlchmlsleevapvlqqtllqdnllgrvhfdqfkealililsrt 

tiSNEEHFQEPDCSLEAQPKYVRGGKRYGRRSLPEFQESVEEFPE 

/T viepldeearpshi pagdcs ehwktqrsee YEAEGQLRFWNP

>DLNASQSGSSPPQDWI EEKLQEVCEPLGITRDGHLNRKKLVS 1
CEQYGLQNVDGSMIiEEVFHNLDPDGTMSVEDFFYGLFKNGKSLT 
PSASTPYRQLKRHLSMQSFDESGRRTTTSSAMTSTIGFRVFSCL 
DDGMGHASVERILDTWQEEGIENSQBILKALDFGLDGNINLTEL 
TLALENELLVTKNS IHQACI 


6745 


1


588 


TFRDQGWAQRRRWLIiGCASWESWEAAIAAGPGIjPSSTARQQNNP " 
AAGTECFAAVWARGTAMGSVLSTDSGXSAPASATARALERRRDP 
ELPVTS FDCAVCLEVLHQPVRTRCGHVFCRSCIATSIiKNNKWTC 
PYCRAYLPSEGVPATDVAKRMKSEYKNCAECDTLVCLSEMRAHI 
RTCQKYIDKYGPLQELEETA


6746


110 


492 


GATGAMAESAPARHRRKRRSTPLTSSTLPSQATEKSSYFGTTEI " 

SLWTWAAIQAVEKKMESQAARLQSLEGRTGTAEKKLADCEKMA 

VEFGNQLEGKWAVLGTLLQEYGLLQRRIiENVENIiLRNRN 


6747 


247 


484 


EAVTFKDVAWFTEEELGIjLDIiAQRKLYRDVMLEWFRNLLSVGH " 
QPFHRDTFHFIjREEKFWMMDIATQREGNSVYAGVC 


6748 


201 


665 


MTTFKEAVTFKDVAWFTEEELGLLDPAQRKiiYRDVMLENFRNL 
LSVGNQPFHQDTFHFLGKEKFWKMKTTSQREGNSGGKIQIEMET 

vpeagpheewscqqiweqiasdltrsqnsirnssqffkegdvpc 

Q I EARLS i sxvqqxp YRCNECKQ 


6749 


95 


719 


rrevkggdgvcprargspqsqqfpscagggeglqqsgealdgam " 

SAGGPCPAAAGGGPGGASCSVGAPGGVSMFRWXiEVLEKEFDKAF 
VD VDIjLLGE I DPDQAD I T YEGRQ KMTSLSSC FAQLCHKAQS VSQ 
INHKIiEAQIiVDLKSELTETQAEKWLEKEVHDQLLQLHSXQLQL 
HAKTGQSADSGTI KAKLSGPSVEELERELKAN 


6750


3 


428 


scesrrpgakv^vwasgalprdttglgseqpsgdvaqsnratmgt " 
tapgpihllelcdqklmeflcnmdnkdlvwleeiqeeaermftr 

BFSKEPELMPKTPSQKNRRKKRRISYVQDENRDPIRRRLSRRKS 
RSSQLSSRR 


6751 


152 


1417 


PTICATEMAGASVKVAVRVRPFNSREMSRDSKCIlQRSGSTTTIV 
NPKQPKETPKSFSFDYSYWSHTSPEDINYASQKQVYRDIGEEML 
QHAFEG YNVCI FAYGQTGAGKS YTMMGKQEXDQQGI I PQLCEDL 
FSRINDTTNDNMSYSVEVSYMEIYCERVRDLLNPKNKGNlaRVRE 
HPLLG P YVEDL S KIAVTS YNDI QDLMDSGNKARTVAATNMNETS 
SRSHAVFNIIFTQKRHDAETNITTEKVSKISLVDLAGSERADST 
GAKGTRLKEGANINKSLTTLGKVISAIiAEMDSGPNKNKKXKKTD 
FIPYRDSVLTWIjLRENLGGNSRTAMVAALSPADINYDETLSTLR 
YADRAKQ I RCNAVINEDPNNKL IRE LKDEVTRLRDLLYAQGLGD 
ITDMTWALVGMS PSSSLSALSSRNV 


6752 


24 


1834 


RNCVPPLGCYRSRVKFHSDIKMQYSHHCEHLiERIiNKQREAGFL " 
CTCTIVIGEFQFKAHRNVLASFSEYFGAIYRSTSENNVFLDQSQ 
VKADGFQKLLEFIYTGTLNLDSWNVKEIHQAADYLKVEEVVTKC 
KIKMEDFAFIANPSSTEISS ITGNIELNQQTCIiLTLRDYNNREK 
SEVSTDLIQANPKQGALAKKSSQTKKXKKAFNSPKTGQNKTVQY 
PSDILENASVELFLDANKbPTPWEOVAQINDNSELELTSWEN 
TFPAQD I VHTVTVKRKRGKSQPNCALKEHSMSNI ASVKS P YEAE 
NSGEELDQRYSKAKPMCNTCGKVFSEASSLRRHMRIHKGVKPYV 
CHLCG KAFTQCNQLKTHVRTHTGEKPYKCELCDKG FAQKCQL VF 
HSRMHHGEEKPYKCDVCNLCFATSSNLKIHARKHSGEKPYVCDR 
CX3QRFAQASTLT YHVRRHTGE KP YVCDT CGKAFAVS SS I» I THS R 
KHTGEKPFICELCGNSYTDIKNLKKHKTKVHSGADKTLDSSAED 
HTLS EQDS I Q K S PLS ETMDVKPS DMTL PLALPLGTEDHHMLLP V 
TDTQS PTSDTLLRS T VNG YS E P Q L I FLQQLY 


6753 


2 


1305


VPSLPYPPQKWAHTEFTTSSDSETANGIAKPDPVMPGGEEKAiS ' 
PFGIKLRRTNYSLlRFNCDQQAEQKKKKRHSSTGDSADAGPPAAG 
SARGEKEMEGVALKHGPSLPQERKQAPS TRRDSAEPSSSRSVP V 
AHPGPPPASSQTPAPEHDKAAWKMPLAQKPALAPKPTSQTPPAS


6754 






PIjSKIiSRP YLVELLSRRAGRPDPE pseps kedqessdrrpps p p 
GPEERKGQKRDEEEEATBRKPASPPLPATQQEKPSQTPEAGRKE 
KPMLQSRHSLDGSKLTEKVETAQPLWITIALQKQKGFREQQATR 
EERKQ AREAKQAEKLS KENVSVS VQPGSS SVS RAGS LH KSTALP 
EEKRPETAVSRLERREQLKKANTLPTSVTVE ISYSS PAAPLVKE 
_ VSKRFSSPDDAPVSSEPAWIALAKRKAKAWSDCPLTTy 


6755 


2 


413


F VRRRRRRLGG PE VNTM S S LHKSR IAD FQDVliKE PS I ALEKItRE 
LS PSG I PCEGGLRCLCWKI LLN YLPLERASWTS I LAKQRELYAQ 

FLREMIIQPGIAKANMGVSREDVTFEDHPLNPNPDSRWNTYFKD 
NEVI.L 


6756 


298 


1343 


FPI.QLQVALEADWFLDMPGGRRGPSRQQLSR^ALPSLQTLVGGG " 
CGNGTGLRNRNGSAIGLPVPPITALITPGPVRHCQIPDLPVDGS 
LL FE FLFF I YLL VALFI Q Y INI YKTVWW YPYNHPAS CTS LNFHL 
I D YHLAAF I T VMLARRL VWAL I SEATKAGAAS M I H YMVL I S ARL 
VLLTLCGWVLCWTLVNLFRSHSVLNIjIiFLGYPFGVYVPLCCFHQ I 
DSRAHLLLTDYNYWQHEAVEESASTVGGLAKSKDFLSLLLESL 
KEQFNUATPI PTHSCPLS PDLIRNEVECLKADFNHRIKEVLFNS 
LFSAYYVAFLPLCFVKVSGYLTFMCFLDLCVNYINWVFLV 


6757 


180 


754 


ifclKAMSSLPl^ I PVSWGSLRTLKYQQQPLRPKVLLCQTRVQCHD 1 

LRSLQPQPPGLKQSFCLRVLGLQTGATTPGLRDLTCKELIILTE 

REAQKRKKRKEKESGMALTQGPLTFRDVAIEFSQEEWKSLDPVQ 

KALYWDVMDENYRNJCVFLGKDNFALEVKICPRVFLYFLCCLSWE 
PFHYLTETEALLTHK 


6758 


2 




I^SRVEAPEAHSRESQGSDAMRKHLSWWWLAWOriLLFSHLSAVQ "1 
TRG I KHR I KWNR KALPSTAQ I TEAQ VAENRPGAF I KQGR KLD ID 
FGAEGNR YYEAN YWQFPDG I HYNGCSEANVTKEAFVTGC I NATO 
AANQGE FQ KPDNKL HQQVI»W J 


6759 


1 


inno 


ASGPEIjPGRRFRDRAPWIiPARLLRGVLAVWVSltSAltGPGSFCRR 
RVPSIAQI,GHSBAAPSPDDVRWSRVPDRCPEERDRAWPPPPPPS 
LPPSFRRNMANNSPALTGNSQPQHQAAAAAAQQQQQCGGGGATK 
PAVSGKQGNVLPLWGNEKTMNLNPMILTNILSSPYFKVQLYELK 
TYHE WDE I Y FKVTHVE P WE KGSRKTAGQTGMCGGVRG VGTGG I 
VSTAFCLLYKLFTLKLTRKQVMGLITHTDSPYIRALGFMYIRYT 
QPPTDLWDMFESFLDDEEDLDVKAGGGCVMTIGEMLRSFLTKLE 
WFSTLFPRI PVPVQKNIDQQI KTRPRKI 1 


6760 


1 


513 


KKHNFHSLDGTSTRAFHPQTGI;PLLSSPVPQRKTQSGCFDLDSS" 
LLHLKSFSSRSPRPCLNIEDDPDIHEKPFLSSSAPPITSLSLLG 
NFEESVLNYRFDPLGIVDGFTAEVGASGAFCPTHLTLPVEVSFY 
SVSDDNAPSPYMGVITLESLGKRGYRVPPSGTIQWCVL 




6761 


239 


606 


VJjSKKKGLSAEEKRTRMMEIFSETKDVFQLKDIjEKIAPKEKGIT 
AMSVKEVLQSLVDDGMVDCERIGTSNYYWAFPSKALHARKHKLE 
VLESQLSEGSQKHASLQKSIEKAKIGRCETEERT 






29 


1733 

1 
1 
i 
I 


ERTL,RGLREVAAPSDVADAAVSRRGRCCCCLHCTQTQVAQDCPS 
S 5 S S VQRCELS L FQ S LHTMTS KK LVNS VAGCADDALAG LVACNP 

NLQLLQGHRVALRSDLDSLKGRVALLSGGGSGHEPAHAGFIGKG 
MLTGVIAGAVFTSPAVGS I IAA I RAVAQAGTVGTLL I VKNYTGD 
RUf FGLAREQARAEGI P VEMWI GDDSA FTVLKKAGRRGIjCGT V 
L I HKVAGALAEAGVGLEE I AKQVNWTKAMGTLG VSLSSCS VPG 
SKPTFELS ADE VELGLG IHGEAG VRR I KMATADE I VKLMLDHMT 
^TTNASHVP VQPGSS WMMVWWLGGLS FLELGI IADATVRSJLEG 
*G VKIARAL VGTFMS ALEMPG I S LTIiLLVDE PLLKL I DAE TTAA 
WPNVAAVS ITGRKRSRVAPAEPQEAPDSTAAGGSASKRMALVL 
]RVCSTLLGLEEHLNALDRAAGDGDCGTTHSRAARAIQEWLKEG 
'PPASPAQLLSKLSVLLLEKMGGSSGALYGLFLTAAAQPLKAKT 
JLPAWSAAMDAGLEAMQKYGKAAPGDRTMLDSLWAAGQEL 
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ID 
NO: 
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beginning 
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location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


6762 


3 


613 


ASTISWRLCVAGAEARRPVPVAGERAGGGAMWFMYLLSWLSLFI 
QVAFITLAVAAGLYYTiAKTiTRF YTVaTQT^T T wvrxmirc»T.T\tr-r -r/-. 

LYVFERFPTSMIGVGLFTNLVYFGLLQTFPFIMLTSPNFILSCG 
LVWNHYIAFQFFABE YYPFSEVLAYFTFCLWI IPFAFFVSLSA 
GE1WLPSTMQPGDDVVSNYFTKGKRGK 


6763 


2 


760 


SGPDFPGRRFRGCCCVRPPAGAGMEU3GHWDMNSAPRLV3ETAE 

RKOEOKTGTEA1? AAnCfIA VrtJiDP t?T.T r>r vrnnciT rvr nniT«..,».« 

a "fvortfwovjttvvTrtJtcrcr ±j±j x XiGQFJjDIjFG V SM WP 
LLS LH VKS LGAS PTVAG I VGSS YG I LQLFSSTLVGCWS D WGRR 
SS LLACI LLSALGYL LLGAATNVFIj FVLAR VPAG I FKHTI*S I SK 
ALLSDWPEKERPLVIGHFNTASGVGFTLGPWGGYT.TELEDGF 
YLTAF I CFLVFI LNAGIiVWFFPRREAKPGSTE 


6764 


80 


438 


LKKMDTMMLSVRNLFEQLVRRVEILSEGNEVQFIQLAKDFEDFR 
KKWQRTDHBLGKYKDLl^KAETERSAJLDVKLKHARNQVDVEIKR 
RQRAE ADCEKL E RQ IQI, I REMLMCDTSGS I Q 


6765
6766


3 


550 


ARYSRVDHFCRRRO^VARAPRFLLQFPSGPSRHFLAACVARWlT' 
RGS VL VSEALS GSAKDG I VTE VAVGVKRGSDE LL SGSVhSS PNS 
«i»idc>n v \ri AN^JNJUSKKFKGEDKMDGAPSRVLHIRKLPGEVTETE | 

VIALGLPFGKVTNIIxMLKGKNQAFLELATEEAAITNGNYYSAVT ■ 
PHLRNQ 1 


6767 


1 


1287 


EGGS FKASLTWLWPLGEMKLHCE VE VI SRHIiPAIiGLRNRGKGVR ' 
AVLSliCQQTSRSQPPVRAFLLISTLKDKRGTRYEIiRENIEQFFT I 
KFVDEGKATVRIiKEPPVDICLSKANSSSLKGFLSAMRLAHRGCN j 
VDTPVSTLTPVKTSEFEWFKTKMVITSKKDYPIjSKNFPYSLEHL 1 
QTSYCGLVRVDMRMLCLKSLRKbDLSHNHIKKLPATIGDLIHLQ 
ELN LN DNHL E S FS VALCHS TLQKS L WS LDLS KNK I KAI*P VQ FCQ 

i^elknlkijddneliqfpckigqlini^flsaarnklpflpsef 

RNLS LEYLDLFGNTFEQPKVLPVI KLQAPLTLLESS ARTI LHNR 
IPYGSHIIPFHLCQDLDTAKICVCGRFCLNSFIQGTTTMNLHSV 
AH TWL VDNIK3 GT EAP 1 1 S YFCSLGCYVNSSDI 


6768


336 


919 


APMICIiCSSDLQFRYKEAFLRDRGLQlGYCSVDDDPRMKHFLNV 
GRLQSDNEYKKDFAKSRSQFHSSTDQPGLLQAKRSQQLASDVHY 
RQPLPQPTCDPEQLGLElHAOKAH0LOSDVXYK<5nT,Mr,TT?r , x;r»ui'r 

PPGSYKVEMARRAAFXANARGLGLQGAYRGAEAVEAGDHQSGEV 
MPDATE I LHVKKKKALLL 


6769


2 
284 


363 
396 


PGS TI 3 C YI*IiSEGSLPLCMQ VACGEEKHRAPTMKTLRAR FKKTE 
LRLSPTDLGSCPPCGPCP1PKPAARGRRQSQDWGKSDERLLQAV 
ENNDAPRVAAI» I ARKGLVPT KLDPEG KS AFHlr 


6770 
6771


1 


3$7 


MSTPDFSTA^NNQELiANEVSCIiKAMLTLMLQAMGQAD 1 

URWYQVIWSSTMAKLHDYYKDEVVKKI^^fiFKYNSVMQVPRVEK 

itlnmgvgeaxadkklldnaaadlaaisgqkplitkarksvagf 

KIRQGYPIGCKVTLRGERMWEFFERLITIAVPRIRDFRGLSAKS 


6772 


3 




378


APAG TLAM TG KS VKDVDR YQAVLANLLLBEDNKFCADCQSKG PR 
WAS WNIGVFICI RCAGIHRNLGVHI SR VKS VNLDQWTQEQ I QCM 
QEMGNGKANRL YEA YLPET FRRPQ IDP YLF WSNLEG 






1400 

< 


AAAFLQGMTVlVGFINTVITSL\ERRYDLHS^QSGLIASSYDIAA 
CLCLTFVSYFGGSG\HKPRWLGWGR\VLMGTGSLVFALPHFTAG 
P**GWKLDAGVRTCPANPR\PVCAG\HTSGI>SRYQ1jVFMLGQFI* 
HGVGATPLYTLGVT YLDENVKS SCSP I YIAI FYTAAI LGPAAG Y 
b I GGALLN I YTEMGRRTELTTESPLWVGAWWVGFLGSGAAAFFT 
AVPILGYPRQLPGSQRYAVMRAAEMHQLKDSSRGEASNPDFGKT 
IRDLPLSIWLLLKNPTFILLCLAGATEATIiITGMSTFSPKFLES 
3FSLSASEAATLFGYLWPAGGGGTFLGGFFVNKLRLRGSAVIK 
FCLFCTWS LLG IL VFSLHC P S VPMAGVTAS YGGSLL PEGHLNL 
rAPCNAACSCQPEHYSPVCXSSDGLMYFSLCHAGCPAATETNVDG 
5KVYRDCSCIPONLSSGFGHATAGKCTST 
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SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


6773 


1 


630 


P WE APKB HKYKAEEHTWLT VTGE PCHFP FQ YHRQLYHKCTHKG 
R PG PQ PW CATTPNFDQDQRWG Y CLEP KKVKDHCSKHS P CQKGGT 
CVNMPSGPHCLCPQHIjTGNHCQKEKCFEPQLLRFFHKNEIMYRT 
EQAAVARCQCKGPDAHCQRLASQACRTNPCLHGGRCLEVEGHRIi 
CHCP VGYTGPFCDVGE *GSGASRRPAPRWDGLAR 


6774 


146 


389 


LTEliSDQQYFLFFILS S / WVPTFLSMDVDGRVIKADS FSKI TSS~] 
GLRIGFLTGPKPLIERVILKIQVSTLHPSTFNQLMISQ 


6775 


104 


614 


r * 3WiJX% v u 1 -HKvVjKKAfbPy IjW IbVLALxESKWRSHRI LRMNS 
GRPETMENLPALYTIFQGEVAMVTDYGAFIKIPGCRKQGLVHRT 

hmsscrvdkpseivdvgdkvwvkligremkndrikvslsmkwn 
. QQtgkdldpnnvXslskkrgggdpsritlgrrsplrls 


6776 


3 


1108 


HERHERHEGALSQDALLRIS i pldsnmrpekcrrfvhpqwqllh i 
j LNGTFPKTSDADMEPCVDGWVYDRISFSSTIVTEWDLVCDSQSL 
-to v/*ivr vcwH^iYawivoGXlx^HIjSDRFGRRFVIjRWCYLQVAIVGT 1 
CAAI^ITLIYCSLRFLSGIAAMSLITNTIMLIAEWATHRFQAM 
GITLGMCPSGIAFMTLAGLAFAIRDWH1LQLWSVPYFVIFLTS 
SWLLESARWLI INNKPEEGLKELRKAAHRSGMKNARDTLTLEIL 1 
1x0 A"^U£<AAW^K-Pr IjQERIiHMPNICKRISLIjPFTKFAlJFKA [ 
YFGLNLHG/ LKHLGNNVFIjIjQTLFGAV/TPPGQLVIjHLGHWGSG 
RVSS RGRVKCLGLFVCQVW 


6777 
6778 


779 


63 


cffhgpawrdcevratfakkqgqsgiisciafspaqplyaotstH 

GRSLGIjYAWDIX?SPLA1jIjGGHQGGITHLCFHPDGNRFFSGARKD 
j-wB.jj*ji_«ujjity^i < ,x ^IjWoI^kEVTXNQRI YFDIiDPTGQFliVSGST I 
SGAVS VWDTDGPGNDGKPEPVLS FLPQKDCTNGVSLHPSLPLLG 

hclpvsvcflsptesggrrrgagpslgsprrhvhlecrlqlwwc 
gggarlqhp**sprarkgr 1 




311 


805 


j-v=»j- A ^»oK^oXKKiy>IFANTRLRIiNVP\EBTAGDSE/ERSPEEE j 

vqadprirsaspkcptsspfpkgrspegeget\dpekvhfhpgp 
kdksvaekw\kgp\spvssegikdffsmkpewenlnqsnvrrmh 

T\AVRLNEVIVKKSRDAKLVLLNMPGPPRNRNGDENY I 


6779 
6780 


2 


535


KALRRQPRljlJU^GIEPESMAISEPIKGSRKPCVNKEELALKKP^ j 
MAKCAWKGPREPPQDARAEAESPGGASESDQDGGHESPPKKKAV 
AW VSAKN PAPMRKKKKVS LGP VS YVL VD S EDG R KKPVM P KKGPG 
SRREASDQKAPRGQQPAEATASTSRGPKAKPEGSPRRATNESRK 




3 


403 


Hfa VNDNKPE ININIiKS PGKEEIS YI PEGDP IDTFVALVRVQDKD 

SGLNGEIVCKLHGHGHFKIjOKTYT'MIJVT TTT»MATT r\r>irirYio.r>*r«-» 1 

ltviaedrgtpslstvkhftvqindindnpphfqrsryefvise 

K ! 


6781
6782 


1 


1269 


APTRPVFPTLgDl,SSSKEPSNSLNl,PHSNELCSSLVHPELSEV5 
SNVA PS I PP VMS RP VS S S S I STP1»P PNQI T VF VT SNPITTS ANT 

SAALPTHIX3SALMSTVVTMPNAGSKVMVSEGQSAAQSNARPQFI 
TPVFINSSSIIQVMKpSQPSTIPAAPLrTNSGLMPPSVAVVGPL 
HlPQNIKFSSAPVPPNALSSSPAPNIQTGRPLVLSSRATPVQIiP 
SPPCTSSPWPSHPPVQQVKELNPDEASPQVWTSADQNTLPSSQ 
STTMVSPLLTNS PGS5GNRRSPVSS S KGKGKVDKIGQI I»LTKAC 

KKVTGSIiEKGEEQYGADGETEGQGLDTTAPGLMGTEQLSTELDS 
KTPTPP APTLLKT4TSSP VGPGTASAGPSLPGGALPTS VRS IVTT 
LVPSELI SAVPTTKSNHGG IASESLAG 




3 


1327 

1 
i 
1 
I 
I 


KKPTVIRIPAKPGKCLHEDP<jSPPPLPAEKPIGNTFSTVSGKLS I 

WERTRNLESNHPGQTGGFVRVPPRLPPRPVNGKTI PTQQPPTK 

/PPERPPPPKLSATRRSNKKLPFNRSSSDMDLQKKQSNLATGLS 

<AKSQVFKNQDPV1 I PPRPKPGHPLYSKYMLSVPHGIANED1VSQ 

IPGELSCKRGDVLVMLKQTENNYLECQKGEDTGRVHLSQMKLXT 

^DEHLRSRPWPFSPPKAPSHAQKPVDSGAPHAWLHDFPAEQV 
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Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








DDLNLTSGEIVYHjEKIDTDWYRGNCRNQIGI fpanyvkvi I di 
PEGGNGKRECVSSHCVKGSRCVARFEY IGEQKDELSFSEGEI 1 1 
1 LKEYVNEEWARGE VRGRTG I FPLNFVE PVEDYPTSGANVLS TKV 

PLKTKKEDSGSNSQVNSLPAEWCEALHSFTAETSDDLSFKRGDR 
I 


6783 


3 


1750 


S YHHHHAQQS AAAS PNliTAS QKTVTTTS M I TTKTLPLVIjKAATA 
TMPAS WGQRPTIAMVTAINSQKAVLS TDVQNTPVNLQTSS ICVT 
GPGAEAVQIVAKNTVTLQVQATPPQPIKVPQFIPPPRLTPRPNF 
LPQVR P KP VAQNNI P 1 A PAP PPMLAA PQLIQR P VMI/T KFTPTTI, 
PTSQNSIHPVRWNGQTATIAKTFPMAQLTSIVIATPGTRLAGP 
QTVQLSKPSLEKQTVKSHTETDEKQTESRTITPPAAPKPKREEN 
PQKLAFMVSLGIiVTHDHLEEIQSKRQERKRRTTANPVYSGAVFE 
PERKKSAVTYLNSTMHPGTRKRGRPPKYNAVLGFGALTPTS PQS 
SH PDS PENEKTETTFTFP AP VQP VS L PS PTSTDGDI H ED FCS VC 
RKSGQIiliMCDTCSRVYHLDCLDPPLKTIPKGMWlCPRCXJDQMLK 
KEEAI PWPGTLAI VHSYI AYKAAKEEEKQKLLKWSSDLKQEREQ 
LEQKVKQLSNS I S KCMEMKNTILARQKEMHSSLEKVKQLIRLIH 
GIDLSKPVDSEATVGAISNGPDCTPPANAATSTPAPSPSSQSCT 
ANCNQGEETK 


6784 


3 


1750 


SYHHHHAQQSAAAS PNLTASQKTVTTTSMITTKTLPL.VLKAATA 
! TMPAS WGQRPTIAMVTAINSQKAVLSTDVONTPVNT^yr<5 qicvt* 
GPGAEAVQ I VAKNT VTLQVQATPPQP I KVPQFI PPPRLTPRPNF 
LPQVRPKPVAQNNIPIAPAPPPMLAAPQLIQRPVMLTKFTPTTL 
PTSQNS I HPVRWNGQTAT I AKTFPMAQLTS I VIATPGTRLAGP 
QWQLSKPSLEKQTVKSHTETDEKQTESRTITPPAAPKPKREEN 
PQKLAFMVSLGLVTHDHLEEIQSKRQERKRRTTANPVYSGAVPE 
PERKKS AVTYLNS TMHPGTRKRGR P PK YNAVLGFGALTPTSPQ S 
SHPDSPENEKTETTFTFPAPVQPVSLPSPTSTDGDIHEDFCSVC 
RKSGQLLMCDTCSRVYHLDCLDPPLKTIPKGMW1CPRCQDQMLK 
KEEAI PWPGTLAIVHSYIAYKAAKEEEKQKLLKWSSDLKQEREQ 
LEQ KVKQ LSNS I S KCMEMKNTI LARQKEMHS SLEKV KQL IRL I H 
GXDLtS KPVDS EATVGAISNG PDCTPPANAATS TP AP S P SSQS CT 
ANCNQGEETK 


6785 


1 


528 


J^NTVLHYCSMYSKPECLKLLIiRSKPTVDIVNQAGETAIjDIAKR 
LKATQCEDLLSQAKSGKFNPHVHVEYEWNLRQEEIDESDDDLDD 
KPSPVKKERSPRPQSFCHSSSISPQDKIALPGFSTPRDKQRLSY 
GAFTNQIFVSTSTDSPTSPTTEAPPLPPRNAGKGPTGPPITPHR 


6786 


1820 


1397 


KSPKVLVLAPTRELANHVSRDFKDI\TRKLTVARFYGGTSYQSQ 
INHIRNGIDILVGTPGRIKDHI^SGRLDLSKLRHVVLDEVDQML 
DLGFAEQVEDI I HE S YKTDS EDNPQTLLFS ATCPQWVYTVA\KK 
YMKSRYEQVDLDGKMTQKAATTVEHLAIQCHWSQRPAVIGDVLQ 
VYSGSEGRAIIFCETKKNVTEMAMNPHIKQNAQCLHGDIAQSQR 
E I TLKGFREGS PKVLVATNVAARGLDI PEVDLVIQSS PPQDVES 
YIHRSGRTGRAGRTGICICFYQPRERGQLRYVEQKAGITFKRVG 

VPSTMr^TiVTf^lf QMT">2V TOOT ACircvMtimcitinrKNKnnT 

v *» w vx ^^^ w "AJLKbJLiAi>V5YAAvDFFRPSAQRLIEEKGAV 
DALAAALAH I SGAS S FE PR SL 1 TSDKGFVTMTLESLE E I QD VS C 
AWKELNRKLS SNAVS Q I TRMCLLKGNMG VCFDV PTTES ERLQAE 
WHDS DW I LS VPAKLP EI EEY YDGNTSSNSRQR SG WS S GR SGRSG 
RSGGRSGGRSGRQSRQGSRSGSRQDGRRRSGNRNRSRSGGHKRS 
FD * VF YHLVD FLSDFLVDS VY1/TGRQ I DRLTGLTGL I DH LTS HS 
SVWN 


6787 


2646 


2270 


PSSFPKNVPLEEI,EEPPK*KRSGLG6LTPKSQIQNGP*PQTFFF 
FELGSPSGVISAHCNLRLLGSSDSPAPASRVAGIIGTCHHAWLI 
L VFL VBMG FHHVGQAGLKLI*TL \ VI H P P WP P KVIiGLQT 


6788 


16 


936 < 
1 1 


SGTVDLRXDMIAVSVIiAAVRGGR/ATVRRVRESNVLHEKSKGKT 
1EGAEDKMTSGDVL SNRKM F YLL KTAFPS VQ INTE EHVD \ ELDQ 
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Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


6789 






EVILV7GS*DS*GYPKGK*LLPKEVPSR/RVLLSGLTPLDATQE\" ' ' 
FTEDL S K\ YVTTMVC VAVNG KPMLGVI HKP FS E YTAWAM VDGGS 
NV KARS S YNE KTPR I WSRS HS GMVKQ VALQT FGNQTTI I PAGG 
AGYKVLALIJ3VPDKSQEKADLYIHVTYIKKWDICAGNAILKALG 
GHMTTLSGEE I S YTGSDG 1 EGGLLAS I RMNHQALVRKLPDLE KT 
j GHK 


6790 


2 


678 


GNG INVLiKI APESAIKFMAY EQ IKRLV W * * PGDS * GF/ YERL VA 
GSLiAGA I AQS S I Y PMEVLKTRMALRKTGQ YSGMLDCARR ILARE 
G VAAF YKG Y VPNMLGI I P YAG I DLAVYE TL KNAWLQHYAVNS AD 
PGVFVLLACGTMSSTCGQLAS YPLALVRTRMQAQAS IEGAPE VT 
MSS liFKHXLRTEGAFGLYRGLAPWFMKVI PAVS IS YVVYENLKI 
I TLGVQSR 


6791 


2 


4068 


APPAGR RRMQAAPRAG CGAALLLW I VS S CLCRAWTAPS Tg QKCD 
EPLVSGLPHVAFSSSSSISGSYSPGYAKINKRGGAGGWSPSDSD 
HYQWLQVDFGNRKQ ISAIATQGRYSSSDWVTQYRMLiYSDTGRNW 
KPYHQDGNIWAFPGNINSDG WRHELQHPI I AR YVR I VPLDWNG 
EGRIGLRIEVYGCSYWADVINFDGHWLPYRFRNKKMKTLKDVI 
ALNFKTSESEGVILHGEGQQGDYITLBLKKAKLVLSLNLGSNQL 
GPIYGHTSVMTGSLLDDHHWHSWIERQGRSINLTLDRSMQHFR 
TNGEFDYLDLD YE I TFGG I P FS GKPS S S S RKNFKGCMES INYNG 

| VNITDLARRKKIiEPSNVGNLSFSCVEPYTVPVFFNATSYLEVPG 
RLNQDLFS VS FQFRT WNPNGLLVFSH FADNLGNVE I DL»TE SKVG 
VHINITQTKMSQIDISSGSGLNDGQWHEVRFIAKENFAILTIDG 
DEASAVRTNS PLQ VKTGEKYFFGGFLNQMNNSSHS VLQPS FQGC 
MQLIQVDDQLVNLYEVAQRKPGSFANVS IDMCA2 IDRCVPNHCE 
HGGKCSQTWDSFKCTCDETGYSGATCHNS I YEPSCEAYKHLGQT 
SNYYWIDPDGSGPLGPLKVYCNMTEDKVWTIVSHDLQMQTPVVG 
YNPEKYS VTQLVYSASMDQI SAI TDSABYCEQYVS YFCKMS RLL 
NTPDGSPYTWWVGKANEKHYYWGGSGPGIQKCACGIERNCTDPK 
YYCNCDADYKQWRKDAGFLSYKDHLPVSQVWGDTDRQGSEAKL 
SVG PLRCQGDRNYWNAAS FPNPS S YLHFSTFQGETSAD I S FY FK 
T LT PWG VFLENMGKEDF I KLELKS ATE VS FS FD VGNGP VE I WR 
SPTPLNDDQWHRVTAERNVKQASLQVBRLPQQIRKAPTEGHTRL 
EI>YSQLFVGGAGGQQGFLGCIRSLRMNGVTLDLEERAKVTSGFI 
S G CS GHCTS YGTNCENGGKCLER YHG YS CD CSNTAYDGT FCNKD 
VGAFFEEGMWLRYNFQAPATNARDSSSRVDNAPDQQNSHPDLAQ 
EEIRFSFSTTKAPCILLYISSFTTDFLAVLVKPTGSLQIRYNLG 
GTREP YNI D VDHRNMANGQPHS VNI TRHEKT I FLKLDHYPS VS Y 

HbPSSSDTLFNSPKSLFIiGKVIETGKIDQEIHKYNTPGFTGCLS 
RVQFNQIAPLKAALRQTNASAHVHlQGELVESNCGASPLTriSPM 
SSATDPWHLDHLDSASADFPYNPGQGQAIRNGVNRNSAX IGGVI 
A\WIFTPSLCTP\VLP*SR*HVSPHKGTLPIPNEAKGAGSRQK 
KPGRRPSMNNDPPTSQRPIDESKKEWPHLRGGYLAKG 




6792


1801 


1193


TGHEGAKQEKODKGDLGPRGERGQHGPKGBKGYPGIPPEL/PGW 
SAW*SWLTAASTKVQAILLPQPLE*LGLQIAFMASLATHFSWQ 
NSG 1 1 FSSVETNIGNFFDVMTGR FGAPVSGVYFFTFSMMKHEDV 
SEVYVYLMHNGmVFSMYSYEMKGKSDTSSNHAVLKLAKGDEVW 
CiRMGNGALHGDHQRFSTFAGFLLFETK 






33 


1073 i 
] 
( 

: 
] 
c 
/ 
p 


/RHTiWGVDMYLFSLQSESPKGAIGHIVSTEKTILAVERNKVLL 
PPLWNRTFS WGFDDFS CCLGS YGSDK VLMTFENIiAAWGR CLCA V 
:PSPTTIVTSGTSTWCVWELSMTKGRPRGIiRLRQALYGHTQAV 
rCLAASVTFSLLVSGSQDCTCILWDLDHLTHVTRLPAHREGISA 
CTI SDVSGTIVSCAGAHLSLWNVNGQPLAS ITTAWGPEGAITCC 

:lmegpawdtsqhitgsqdgmvrvwkt/vgcedvcswtasrrg 

lPGSASKPKRPQVGEEPGLESRAGR*HCFDREAQQN<3P\PVTAL 
kVS RNHTKLLVGDERGR I FC WSADG » EERGSRG SQTT VPG 
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amino acid 
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Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


6793


2340 


805 


GRKEANY \ YGSLTQAGT VSLGLDAEGQEVFVP FSAVLPMVAPND 
LVFDGWDISSLNLAEAMRRAKVLDWGLQEQLWPHMEAIjRPRPSV 
YIPEFIAANQSARADNLIPGSRAQQLEQIRRDIRDFRSSAGLDK 
VI VLWTANTERFCEVI PGLNDTAENLLRT IELGLEVS PSTLFAV 
AS I LEGCAFLNGS PQNTLVPGALELAWQHRVFVGGDDFKSGQTK 
VKSVLVDFLIGSGLKTMSlVSYNHIiGNNDGBNLSAPLQFRSKEV 
S KSNVVDDMVQSNPVLYT PGEEPDHCWIKYVPYVGDS KRALDE 
YTSELMLGGTNTLVLHNTCEDSLLAAPlMIxDLAlil.TELCQRVSF 
CTDMDPE PQTFH P VLS LLS FLFKAPLVPPGS P WNALFRQRSC I 
ENILRACVGLPPQNHiVIIiLEHKMERPGPSLKRVGPVAATYPMIjNK 
KGPVPAATNGCTGDANGHLQEEPPMPTT*GPGHTVSRLFLPAAP 
HD P TLKAPTNKGR CHFS P PSTWGS WGL 


6794 


169 


1349 


DDVKRKPEASAH * EKPGP PSRPGVRGGRERAGGRGS HGARS CR \ 
EPAPPAPAP PEDHPDEEMGFTI DI KS FLKPGEKTYTQRCRLFVG 
NLPTDITEEDFKRIiFERYGEPSEVFINRDRGFGFIRJjESRTLAE 
IAKAELDGTI LKSRPLRIRFATHGAALTVKNLS PWSNELLEQA 
FSQFGPVEKAVWVDDRGRATGKGFVEFAAKPPARKALEROGDG 
AFLLTTTPRPVI VE PMEQFDDEDGLP E KLMQKTQQ YHKE REQPP 
R FAQ PGTFE FE YAS RWKALDEMEKQQREQVDRNI REAKE KLE AE 
MEAARH EHQLMLMRQDLMRRQEEI*RRUEELRNQE LQKRKQ I QLR 
HEEEHRRREEEMIRHREQEELRRQQEGFKPNYMEWYVCHFLR 


6795


1740 


1010 


GPRRQTQ VRDI I ELDS F * D WAAQETDCAQNSG ERIj * KGV/LENFS ' 
TMS KS AVKI S LDLLS NPLCE QDQDIiLNMVTALDTAMKRMDAFNQ 
EKVNQ I QKT VI E PL KKFGS VFPS LNMAVKRREQ ALQD YRR LQAK 

vekyeekektgpvlaklhqareelrpvredfeaknrqlleempr 
fygsrldyfqpsfesliraqwyysemhkifgdlshqldqpghs 
deqrereneaklselrals i vadd 


6796 


48 


683 


GKE IQI PTI KLAWLLFGLE * PVGALGKGWS F+ * SHVALGQLGW " 
LTRAVRSSWRWELCVSAQEWSQRSA*SSPSPVGACPSLNPPET 
SVQEGRDCWQR*LPRLFSALVGQPGCWPQGAPPERCV* PGRCKW 
HLQSQVLR* ERRRCCRCLPRFA* GWRRRHQRLGLG IHPAPLGST 
SPPHPEGNSQQCRR*GWAAELRLPSSWL*GKLGC* 


6797 


1620 


211 


TERMTPSQPTRGSSCTRPSSMLWTSTWRCLTCHWAGMRMSWGV 
TLGPMAQGLLSASGTTTEATWTRPTTHLTLIRWWLLTASRVDPP 
ERPPPPPSDDLTLLESSSSYKNL/nAQIPQ/DWSMSPSTSG*RP 
LTSRASS IMRSRTAIPSAS *SRLTTKHTVGGSPSAWRPRPTSRS 
VSTPVSSSTETTASGSCLTWWSSSPAPCPSSSAPAHSFEASCCK 
TSLWGSCGGSGDGSSACX3SGWNLSMAGTSCSSPAMCSPSRAPS* 
RSASRPRTWRATTSAASSWAPRRCWCGWA*SAT*PSSTTTISSS 
PHCX5WPCPASCAS AAAWLSSTWATAS VAGSCWGP IM* SS AHSPW 
CLSACSRSSMGTTCL*RSPP\SGASRAAAAWCGSSPSSTFTPSS 
ASSSTWCSASSSRSSPAPTTPSSIPAAQAQRRASCRPTSHSART 
APPPAS S AAGAARPAAFSAAAEGTPRRS I RCW 


6798 


3894 


1696 


STISWESLESWLNKATNPSNRQEDWEYIIGFCDQINKELEG*VS 
ALWGQLRGSGLGRGTTMAKEGQPGS PRLSALECVLLVPQ\PQ I A 
VRLLAHKIQS PQEWEALQALT YLGDRVS EKVKTKV I E LLYSWTM 
ALPEEAKI KDAYHMLKRQGI VQSDPP I PVDRTLI PS P P PRPKNP 
VFDDEEKSKLLAKLLKSKNPDD1X2EANKLIKSMVREDEARIQKV 
T KRLHTLEE VNNNVRLLSEMLLH YS QEDS S DGDRELMKEL FDQ C 
ENKRRTL FKLASETEDNDNS LGD I LQASDNLS RVI NS Y KT 1 1 EG 
QVINGEVATLTLPDSEGNSQCSNQGTLIDLAELDTTWSLSSVLA 
PAPTPPSSGIPILPPPPQASGPPRSRSSSQAEATLGPSSTSNAL 
SWLDEELLCLGLADPAPNVPPKESAGNSQWHLLQREQSDLDFFS 
PRPGTAACGASDAPLLQPSAPSS S SSQAPLPPPFPAP WPASVP 
AP S AGS S L FS TGVAP ALAPKVB P AV PGHHGLALGNSALHHLDAL 
DQLLEEAKVTSGLVKPTTSPLIPTTTPARPLLPFSTGPGSPLFQ 
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Predicted end 
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residue of 
amino acid 
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Amino acid segment containing signal peptide" 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








PLSFQSQGSPPKGPELSLASIHVPLESIKPSSALPVTAyDKNGF 
RIIiFHFAKECPPGRPDVLWWSMLNTAPLPVKSlVLQAAVPKS 
MKVKLQPPSGTELS P PS P I QPPAA I TQVMLLANPLKE KVRLR YK 
LTFALGEQLSTEVGEVDQF PPVEQWGNL 


6799 


3894 


1696 


ST I S W ES LES WLNKATNPSNRQEDWE YI I GFCDQ I NKELEG * VS 
AliWGQLRGSGLGRGTTMAKEGQPGSPRLSALECVLLVPQ\PQIA 
VRIjLAHXI Q S PQ EWE ALQALTYLGDRVS E KVKTKV I ELLYS WTM 
ALPEEAKIKDAYHMLKRQGIVQSDPPIPVDRTL1PSPPPRPKNP 
VFDDEEKSKLLAKLLKSKNPDDLQEANKLIKSMVREDEARIQKV 
TKRLHTLEEVNNNVRLLSEMLLHYSQEDSSDGDRELMKELFDQC 
ENKRRTL FKIAS ETE DNDNSLGD I LQ AS DNLS RVI NS YKT 1 1 EG 
QVI NGE VATLTL PDS EGNSQCSNQGTI#IDLAELDTTNSLSS VIA 
PAPTPPSSGIPILPPPPQASGPPRSRSSSQAEATLGPSSTSNAIj 
SWLDEBLLCLGLADPAPNVPPKESAGNSQWHLIiQREQSDIjDFFS 

prpgtaacgasdapllqpsapsssssqaplpppfpapwpasvp 
APSAGSSLFSTGVAPAIAPKVE pavpghhgialgns alhhldal 
dqlleeakvtsglvkpttspliptttparpllpfstgpgsplfq 
plsfqsqgsppkgpelslasihvplesikpssalpvtaydkngf 
rilfhfakecppgrpdvlwwsmlntaplpvksivlqaavpks 

MKVKLQPPSGTELSPFSPIQPPAAITQVMLIiANPLKEKVRLRYK 

i/tfalge qls tevgevdqfpp veqwgnl 


6800 


404 


1646 


RRSPSTGLSPVPQPSS PSbSDYSI pwslllsgtiawatpgk*ag 
*pqaw*lglapaiafi/gltrgrkqnkekmaeggsgdvddagdc 
sgaryndwsdddddsnesksivwyppwarigteagtrararara 

RATRARRAVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILE 

AALI ALGNNAAYAFNRDI irdlgglp i vaki lntrdp I vkekal 

IVLNNLSVNAENQRRLKVYMNQVCDDTITSRLNSSVQLAGLRLL 
TNMTVTNE YQHMLANS I SDFFRLFSAGNEETKLQVLKLLLNIiAE 
NPAMTRELLRAQVPS SLG \ S L FNKKBNKE V 1 LKtiL V I FENINDN 

FKWEEl^PTQNOFGEGSLFFFLKEFQVaU>iCVIiGIESHHDFIjVX 
VKVGKFMAKLAEHMFPKSQE 


6801 


2 


1755 


SAEEFE SQyAS V TMHDVDAESFE VLVD YC YTGRVSLS EANVERL 
YAAS DMLQLE YVREACAS FLAR RLDLTNCTAI LKFADAFGHRKL 
RSQAQSYIAQNFKQLSHMGSIREETIiADLTLAQLLAVLRLDSLD 
VBSEQTVCHVAVQWLEAAPKERGPSAAEVFKCVRWhJHFTEEDQD 
YLEGIiLTKPIVKKYCLDVIEGALQMRYGDLLYKSLVPVPNSSSS 
/ R * QQQL S C I CSR KSTPETGYVCQ G DGDLL WTPQRSLS \R YDP Y 
SGDIYTMPSPLTSFAHTKTVTSSAVCVSPDHDIYLAAQPRKDLW 
VYKPAQNS WQQLADRLLCREGMD VA YIiNG YI YI LGGRDP X TGVK 
LKEVECYSVQRNQWALVAPVPHSFYSFELIWQNYLYAVNSKRM 
LCYDPSHNMWLNCASLKRSDFQEACVFNDE I YCICDI PVMKVYN 
PARGEWRRISN I PLDS E THNYQ I VNHDQKLLL I TSTTPQW KKNR 
VT VYE YDTREDQW INI GTMLGLLQFDSGF I CLCARVYP S CLE PG 

QSFITEEDDARSESSTEWDLDGFSELDSESGSSSSFSDDEVWVQ 
VAPQRNAQDQQGSL 


6802
6803 


157 

1 1 


1341 
2203 


ETFPLF FFLLSKTPGKTASMAHFVQGTSRMIAAESSTEHKECAE - 
PSTRKNLMNSLEQKIRCLEKQRKELLEVNQQWDQQFRSMKELYE 
RKVAELKTKLDAAERFLSTREKDPHQRQRKDDRQREDDRORDLT 
RDRLQR EE KE KERLNEELHBL KE ENKLLKG KNTLANKE KEH YE C 
EIKRLNKAIiQDALNIKCSFSEDCLRKSRVEFCHEEMRTEMEVLK 
QQVQ I YE E D FKKERSDRERLNQE KEELQQI N ETSQSQLNRLNS Q 
IKACQMEKEKLEKQLKQMYCPPCNCGLVFHLQDPWVPTGPGAVQ 
KQREHPPDYQWYALDQLPPDVQHKAN/DWCIAPPPVCCQAG/PR 

TPGLK*SSCLWLPKC*NFRFILSKESPSVEVHTNRERQQATRER 
G 

KL SGRP YRHMG VLGTS KLYDIRKT I FTFTPQF I DQQQFYIiALDN 
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SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








KMIVEMLRTDLSYLCSRWRMTGQPTITFPISHSMLDEDGTSLNS ' 
S I LAALRKMQDG YFGGARVQTG KLS E FLTTS CCTHLS FMDPGPE 
GKIiYSBDYDDNYDYLESGNWMNDYDSTSHARCGDEVARYLDHLL 
AHTAPH PKLAPTS OKGGLDRFO AAVftTTCnr ,M ct ,ut v jv \c pt .wn 

NVHMYIiPTKIiFQASRPSFNLLDSPHPRQENQVPSVRVEIHLPRD 
QSGEVDFKALVLQLKETSSIiQEQADILYMIiYTMKGPDWNTELYN 
ERSATVRELLTELYGKVGE IRHWGLIRYI SGILRKKVEALDEAC 
TDLLSHQKHLTVGL PPEPREKT I S APLP YEALTQL I DBAS EGDM 
S I S I LTQE I M V YLAM YMRTQ PG LFAEM FRLRIGLII QVMATE LA 
HSLRCSAEEATEGLMN3LSPSAMKNLLHHI LS GKEFGVERK/S VR 
PTDSNVSPAISIHE IGAVGATKTERTG IMQLKSEIKQVEFRRLS 

ALNRVP VGFYQKVWKVLQKCHGLS VEGFVLPSSTTREMTPGE I K, 
FS VHVES \VLNVLLRPEYRQLLVEAI LVLTMLADI E IHS IGS 1 1 
AVEKIVHIANDLFLQEQKTLGP \DDTM LAKD PASG \ 1 CTLR \ YD 
SAPSGRFGTMTYLS \RAA\ ATYVQEFLP\HS ICAMQ 


6804


1 


951 


GSPGKKEEKAKNKESLCMENSSNSSSDEDEEETKAKMTPTKKYN 
GLEEKRKSLRTTGFYSGFSEVAEKRIKLLNNSDERLQNSRAKDR 
KDVWSS IQGQWPKKTLKELFSDSDTEAAAS P PHPAPEEG VAEES 
LQT VAEEES CSPSVELEKPPPVNVDSKP I EEKT VEVNDRKAEFP 
S SGSNFS A* I PLP YLHLNRtiHQS L * QKGSRQQSS VTVS E PLAPN 
QEEVRS I KS ETDST I EVDS VAGELQDLQSERE * LASRF * CQCKL 

QQKEGKRHK 


6805 


1533 


206 


RQPDLKYFGKS FDVSVS ES SSLLSNDLPKFADGI KARNRNQNYL 
VPSPVLRILDHTAFSTEKSADIVICDEECDSPESVNQQTQEESP 
IEVHTAEDVPIAVEVHAISEDYDIETENNSSESLQDQTDEEPPA 
KLCKILDKSQALNVTAQQKWPLLRANSSGLYKCELCEFNSKYFS 
DLKQHMILKHKRTDSNVCRVCKESFSTEWLLIEHAK3bHEEDPYI 
CKYCDYKTVI FENLSQH I ADTHFSDHIjYWCEQCDVQFSSSSELY 
LHFQEHSCDEQYLCQFCEHETND PEDLHSHWNEHACKL IELSD 
KYNNGEHGQYSLLSKI TFDKCKNFFVCQVCGFRSRmTNVNRHV 
AIEHTKI FPHVCDDCGKGFSSMLE \ IAKHLNSHLSEGI YLCQYW 
EYSTGQIEDLKIHLDFKHSADLPHKCSDCLMRFGNERBLISHLP 
VHETT 


6806 


272 


3794 


VALCFPNSDPVMFMDAFYGCIiLAELGPVPIEVPLTRKDAGSQQV 
GFLLGS CGVFLALTTDACQKGLPKAQTGEVAAFKG WP PL S WI»VI 
DGKHIiAKPPKDWHPLAQDTGTGTAYI EYKTS KEGSTVGVTVSHA 
S LLAQCRALTQACGYS EAETLTNVLDFKRDAGLWHGVLTS VMNR 
M HWS VPYALM KANPLS W I Q KVCF YKARAAL VKS RDMHWSLLAQ 
RGQRDVSLSSLRMLIVADGANPWSISSCDAFLNVFQSRGIjRPEV 
ICPCASSPEALTVAIRRPPDLGGPPPRKAVLSMNGLSYGVIRVD 
TEEKLSVLTVQDVGQVMPGANVCWKLEGTPYLCKTDEVGEICV 
SSSATGTAYYGLIjGITKNVFEAVPVTTGGAPIFDRPFTRTGLLG 
F IGP DHLVF I VGKLDGLMVTG VRRJHNADDVVATALAVE PMKFVY 
RGRIAVFS VTVLHDDR IVLVAEQRPDASEEDSFQWMS RVLQAI D 
S IHQVGVYCLALVPANTLPKAPLGG IHI SETKQRFLEGTLHPCN 
VLMCPHTCVTNLPKPRQKQPEVGPASMIVGNLVAGKRIAQASGR 
ELAHLEDSDQARKFLFLADVLQWRAHTTPDHPIiFLIiLNAKGTVT 
STATCVQ LH KRAERVAAALME KGRL SVG DHVALVYP PGVDIiI AA 
F YGCLYCGC VP VTVRP PHPQNIX3TTLPTVKMI VEVS KS ACVLTT 
OAVTRIiLRSKEAAAAVDIRTWPTILDTDDIPKKKIASVFRPPSP 
DVLAYLD FS VS TTG I LAG VKM S HAATS ALCRS I KLQCEL YP S RQ 
IAICLDPYCGLGFALWCLCSVYSGHQSVLVPPLELESNVSLWLS 
AVSQYKARVTFCCYSVMEMCTKGLGAQTGVLRMKGVNLSCVRTC 
MWAE E RP \R IALTQS FS KL FKDLGL PARAVS TT FGCRVNVA I C 
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SEQ 
ID 
NO: 



Predicted 
beginning 
nucleotide 
location 

corresponding

to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








LQGTAGPDPTTVYVDIV3RALRHDRVRLVERGSPHSLPLMESGKIL 
PGVKVI IAHTETKGPLGDSHLGEI WVSSPHNATG YYTV YGEEAL 
HADHFSARLS FGDTQTI WARTGYLGFLRRTELTDASGGRHDALY 
WGSLDETLELRGMRYHPIDIETSVIRAHRSIAECAVFTWTNLL 
WWELDGLEQDALDLVALVTNWTjEEHYLWGWVIVDPGVI p 
INSRGEKQRMHLRDGFLADQLDP I YVAYNM 


6807 


1444 


606 


VGHDTVHAMFTCFPKCLGFSPPVWVTVSPRSEESHTTTVSGGNG 
. S VFQAGPQLQALANIiEARRGS I GAALSS RDVSGLP VYAQSGE P R 
RLTQAQVAAFPGENALEHSSDQDTWDSLRSPGPCSPLSSGGGAE 
S LP PGG PGHAEAGHLG KVCDFHLNHQQ PS PTS VLPTE VAAP PLE 
KILSVDSVAVDCAYRTVPKPGPQPGPHGSLIiTEGCIiRSLSGDIiN 
RFPCX3MEVHSGQRELESWAVGEAMA\LKFPMGAMSYCLRDRSR 
FLFRLPMGLSCPLQVQ 


6808 


2063 


737 


GVGSGAASALARSRPIASRLSSRRRTRAPRSGAMQRLAMDLRML 
SRELiSLYLEHQVRVGFFGSGVGLSIi I LGFSVAYAF YYLSS IAKK 
PQLVTGGESFSRFLQDHCPWTETYYPTVWCWEGRGQTLLRPF\ 
ITS KP PVQ YRNEL I KTADGGQ I SLDWFDNDNS TC YMDAS TRPTI 

lllpgltgtskesyilhmihlseelgyrcwfnnrgvagenllt 

PRTYCC^TEDLBTVimiWISIiYPSAPFIJViGVSMGGMbLLNYL 

gkigsktplmaaatfsvgwntfacseslekplnwllfnyylttc 
lqssvnkhrhmfvkqvdndhvmkaksirefdkrftsvmfgyqti 
ddyytdaspsprlksvgipvlclnsvddvfspshaipietakqn 
pnvalvltsygghigflegiwprqstymdrvfkqfvqamvehgh 

BLS 


6809 


939 


65 


DYSGQTPVt>TEHGMTLYTPAQTHPEQPGSEASTQPlAGTQTVPQ 

tdeaaqtdsqplhpsdptekqqpkruivsnipfrfrdpdlrqmf 

GQFGK I LDVE 1 1 FNERGS KGFGFVTFETSSDADRAREKIiNGTI V 
EGRKrEVNNATARVMTNKKTGNPYTNGWKLNPWGAVYGPEFYA 
VTGFPYPTTGTAVAYRGAHLRGRGRAVYNTFRAAPPPPP I PTYG 
A WYQDGF YGAE I \ LEATQ PTDTIiS PLiQRRQ PTATVTAES TQLP 
TRTITPSGPRRPTAIiEPCETFHRFLLGP 


6810 
6811 


939 


65 


DYSGQTPVPTEHGMTLYTPAQTHPEQPGSEASTQPIAGTQWP^| 
TDEAAQTDSQPLHPSDPTEKQQPKRLHVSNIPFRFRDPDLRQMF 
GQFGKI LDVE 1 1 FNERGS KGFGF VT FETSSDADRAREKLNGT I V 
EGRKI E VNNATAR VMTNKKTGNP YTNGWKLNP VVGAVYGP E F YA 
VTG FPYPTTGTAVAYRGAHLRGRGRAVYNTFRAAP PPPP I PTYG 
A W YQDG FYGAE I \LEATQPTDTI*S PLQRRQPTATVTAES TQLP 
TRTITPSGPRRPTALEPCETFHRFLLGP 


6812 


1522 


658 


DbVTVWSFVDCRVIASTHGHVKSWVSVVAFDPYTTSVEEGDPME 

FSGSDEDFQDLLHFGRDRADSTQCRIiSRRNSTDSRPVSVTYRFG • 

SVGQDTQLCLWDLTEDILFPHQPLSRARTHTNVMNATSPPAGSN 

GNSVTTPGNSVPPPLPRSNSLPHSAVSNAGSKSSVMDGAIASGV 

SKFATLSLHDRKERHHEKDHKRNHSMGHISSKSSDKbNLVTKTK 

TDPAKTLGTPLCPRMEDVPLLEPLICKKIAHERLTVLIFLEDCI 

VTACQEG FI CTWGRPGK WS FNP 




4001


1682 1 " ~ 

1 

1 


KoAvFSIjDLSTIIQGrWFLNGEEIiKSNEPEGQVEPGALRYRIEQ 
KGLQHRL I LHAVKHQDSGAL VG FS CPGVQDSAALTI QES P VH I L 
SPQDKVSLTFTTSERWLTCELSRVDFPATWYKDGQKVEESELL 
WKMDGRKHRLILPEAKVQDSGEFECRTEGVSAFFGVTVQDPPV 
HIVDPREHVFVHAITSECVMLACEV\DR\EDAPVRWYKDGQEVE 
ESDFWLENEGPHRRLVLPATQPSDGGEFQCVAGDECAYFTVri 
TDVSSWIVYPSGKVYVAAVRLERWLTCELCRPWAEVRWTKDGE 
S WES PALLLQKEDTVRRLVLPAVQLEDSGEYLCE IDDESAS FT 
/TVTEP PVR I IYPRDEVTL I AVTLEC WLMCELSREDAP VRWYK 
DGLEVEESEAIiVLERDGPRCRLVIiPAAQPEDGGEFVCDAGDDSA 
?FTVTVTEPPVQFLALETTPSPLCVAPGEPVVLSCELSRAGAPV 
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SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








VWSHNGR PVQ EGEGLELHAEGPRRVLCIQAAGPAHAGLYTCQSG 
AAPGAPS LS FTVQVAEP P VRWAP EAAQTRVRSTPGGDLE L WH 
LSGPGGP VR W YKDGERLAS GGR VQLEQAGAR Q VLR VQGARSGDA 
GE YLCDAPQDS R I FLVSVEEPLLVKLVSDLTPLTVHEGDDATFR 
CE VS P PDADVTWLRNGAWTPG PQRQS CCS YGGCRMCGQRKART 
CVSKWRQAEWVQRGPCAGCEVGSPCPTTLACPWPRMGTSTASSS 
MVSYWPTRAPTAARATTIAPWPGSA 


6813 


9 


836 


SSTQQRPGVPAGPRPLDGYLGVADHKPLKMHCRDCALVTSSGHL 
LHSRQGSQ IDQTECVI RMNDAPTRG YGRDVGNRTSLRVI AHSSI 
QRILRNRHDLLNVSQGTVFI FWGPSS YMRRDGKGQVYNNLHLLS 
QVLPRLKAFMITRHKMLQP0ELFKQETGQ\NRKrSNTWLSTGWF 
TMTIALELCDRINVYGMGPPDFCRDPNHPSVPYHYYEPFGPDEC 
TMYLSHERGRKGSHHRFITEKRVFKNWARTFNIHFFQPDWKPES 
LAINHPENKPVF 


6814 


3 


737 


KFRRQEAN/ARERNRMHGLNDALDNLRKWPCYS KTQKLS KI ET 
LRLAKNYIWALSEILRIGKRPDLLTFVQNLCKGLSQPTTNLVAG 
CLQLNARS FLMGQGGEAAHHTRS PYSTFYP PYHS PELTTP PGHG 
TLDNS KSMKP YN YCSAYES FYESTS PECAS PQFEGPLSPPP IN Y 
NGIFSLKQEETLDYGKNYNYGMHYCAVPPRGPIiGQGAMFRIjPTD 
SH FP YDLHLRS QSLTMQDELNAVFHN 


6815 


906 


553 


QGLDPASQTKVVELLKDGSGRRGDRRSSRDMAGGAGPRSESDLE 
DVGPTAEWNGDGSGSLRRSGSFGKLRDALRRSSEMLVKKLQGGT 
PQEPPNPRMKRASSLNFLNKSVEEPTQPGG 


6816 


1 


803 


NLLKTHKF \ LLGQDEDSLHS VP VAQMGNYQE YLKTLAS PLRE I D 
PDQPKRLHTFGNPFKQDKKGMMIDEADEFVAGPQNKVKRPGEPN 
SPMSS KRRRSMS LLLRKPQTPPT VTNHVGGKGP PSAS WFPSYPN 
LIKPTLVHTDATIIHDGHEEKMENGQlTPDGFLSKSAPSEIilNM 
TGDLMPPNQVDSLSDDFTSLS KDGL I QKPGSNAFVGGAKNCS LS 
\^DQKDPVASTLGAMPNTLQITPAMAOGlNADIKHQLMKEVRKF 
GRSK 


6817 


172 


3457 


LGMMDSPKIGNGIiPVIGPGTDIGISSLHMVGYLGKNFDSAKVPS 
DEYCPACKEKGKLKALKTYRISFQES I FLCEDLQCI YPLGSKSL 
NNIjISPDLEECHTPHKPQKRKSLESSYKDSIjLLANSKKTRNYIA 
IDGGKVLNS KHNGEVYDETSSNLPDSSGQQNP IRTADSLERNEI 
LEADTVDMATTKDPATVD VS GTGRPS PQNEGCTSKLEMPLESKC 
TSFPQALCVQWKNAYALCWLDCILSALVHSEELKNTVTGLCSKE 
ESI FWRLLTKYNQANTLLYTSQL S GVKDGDCKKLTSK I FAE I ET 
CLNEVRDEIFISLQPQLRCTLGDMESPVFAFPLLLKLETHIEKL 
FLYS FS WDFECSQ CGHQ YQNRHMKS LVTFTNV I P E WH PLNAAH F 
GPCNNCNSKSQIRKMVLEKVSPIFMI,HFVEGLPQNDIjQHYAFHF 
EGCLYQITSVIQYRANNHFITWIIjDADGSWLECDDIjKGPCSERH 
KKFEVPASEIHIVIWERKISQVTDKEAACLPLKKTNDQHALSNE 

kpvsltscsvgdaasaetasvthpkdisvaprtlsqdtavthgd 
hllsgp kglvdn i lp lt le et i qktas vs qlns eaf l\lenkp v 

AENTGILKTNTLLSQESLMASSVSAPCNEKLIQDQFVDISFPSQ 
WNTNMQS VQ LNTEDTVNTKS VNNTDATGL IQGVKS VE I E KDAQ 
LKQFLTPKTEQLKPERVTSQVSNLKKKETTADSQTTTSKSLQNQ 
SLKENQKKPFVGSWVKGLISRGASFMPLCVSAHNRNTITDLQPS 
VKGVNNFGGFKTKGINQKASHVSKKARKSASKPPPISKPPAGPP 
SSNGTAAHPHAHAASEVLEKSGSTSCGAQLNHSSYGNGISSANH 
EDLVEGQIHKLRLKLRKKLKAEKKKLAALMSS PQSRTVRSENLE 
QVPQDGS PNDCES IEDLLNELP YP IDIANESACTTVPGVSLYSS 
QTHEEILAELLSPTPVSTELSENGEGDFRYLGMGDSHIPPPVPS 
EFNDVSQNTHLRQDHNYCS PTKKNPCE VQPDS LTNNACVRTLNL 
ESPMKTDIFDEFFSSSALNALANDTLDLPHFDEYLFENY 


6818 


2 


240


RGFDKVLWT/ LSGAVK\CVQ FSRI S PDGEEG YPGELKVWVT YTL " 
DGGE / LHS / ATTEHKP / VQ AT P VNLT \ T I LTS TWQARIjPQI 


6819 


1 


961 


GIPCTEMGNFDNANVTGEIEFAIHYCFKTHSLEICIKACKNLAY 
GEEKKKKCNPYVKTYLLPDRSSQGKRKTGVQRNTVDPTFQETLK 
YQVAPAQLVTRQLQVSVWHLGTLARRVFLGEVIIPLATWDFEDS 
TTQSFRWHPLRAKADKYEDSVPQSNGELTVRAKLVLPSRPRKLQ 
EAQEGTDQPSLHGQLCLWLGAKNLPVRPDGTX»NSFVKGCLTLP 
DQQKIiRLKS PVLRKQACPQWKHS FVFSGVT PAQIiRQS S LELTVW 
DQALFGMNDRLLGGT \ RLGS KGDTAVGGDACS QSKLQWQKVLS S 
PNLWTDMTLVLH 


6820 


1014 


340 


GDMVYIVGHVPPGFFEKTQNKAWFREGPNEKYLKWRKHHRViA 
GQFFGHHHTDSFRMLYDDAGVPISAMFITPGVTPWKTTLPGWN 
GANNPA IR VFE YDRATLSLKDMVT YFMNLS QANAQGTPR WELE Y 
QLTEAYG VPDASAHSMHTVLDR I AGDQS TLQRY YVYNS VS YSAG 

VCDEACSMQHVCAMRQVDIDAYTTCLYASGTTPVPQLPLLIjMAIi 
LGLCT 


6821 


1088 


518 


EFDIYR/EVGGEFVPVTRDDSSNGFPRTQHGPSPTVHPIQSPQN 
RFCVLTLDPETLPAIATTLIDVLFYSHSTPKEAASSSPEPSSIT 
FFAFSIilEGYI \S IVMDAETQKKFPSDLLLTSSSGELWRMVRIG 
GQ PLG FDECGI VAQI AGPLAAAD I SAYYI STFNFDHAIjVPEDGI 
GSVIEVLQRRQEGLAfi 


6822 


1088 


518 


EFDIYR/EVGGEFVPVTRDDSSNGFPRTQHGPSPIVHPIQSPQN 
RFCVLTLDPETIiPAIATTLIDVLFYSHSTPKEAASSSPEPSSIT 
FFAFSLIEGYI\SIVMDAETQKKFPSDLLLTSSSGELWRMVRIG 
GQPLGFDECGIVAQIAGPLAAADISAYYISTFNFDHALVPEDGI 
GSVI EVLQRRQEGLAS 


6823


654 


221 


PPKLLSRWARMGHGDBIVMjSDLNFPGLLHLPWGPWRSVQTAC ' 
GIPQLLEAVLKLLPLDTYVESPAAVMELVPSDKERGLQTPVWTE 
YE 3 1 LRRAGCVRALiAKI ERFE F YERAKKAFAWATG E TAXi ygnl 
IIiRKGVLAXjNPLL 


6824 


858 


104 


LLIiAQR WGWG \ CCFFS1AVS VKMNVIiLFAPGLLFl*LLTQFGFRG " 
AIiPKLG ICAGLQWLCLPFLLENPSGYLSRS FDIX3RQFLFHWTV 
NWRFI,PEALFLHRAFHIiALI»TAHIiTIiLLLFALCRWHRTGES I LS 
LLRDPSKRKVPPQPLTPNQIVSTLFTSNFIGICFSRSLHYQFYV 
WYFHT IiP YLLWAMPARWLTHLLRLLVliGIj IELS WNT YPS TS CS S 
AALHI CHAVILLQIiWLGPQPFPKSTQHSKKAH 


6825 


3 


1173 


SSGEFGLQASDIMWTISDTGWILIILCSLMEPWAU5ACTFVHLL 
PKFDPLVILKTLSSYPIKSMMGAPIVYRMLLQQDLSSYKFPHLQ 
NCLAGGESLLPETLENWRAQTGLDIREFYGQTETGLTCMVSKTM 
KI KPG YMGTAAS C YDVQI IDDKGNVL PPGTEGD IG IR VKP IRP I 
GIFSGYVDNPDKTAANIRGDFWLLGDRGIKDEDGYFQFMGRADD 
X I NS S G YR IGPS E VENALMEH PAWETAVI SS PDP VRGE WKAF 
VILALOFLSHDPEQLTKELQQHVKSVTAPYKYPRKIEFVIjNLPK 
TVTGKIQRA\KLRDKEWKMSGKAPCAVRHLRDIHLDSPI,LSLSF 
PFGPIjALPMDGYGDSIiWEEHEYICPCTiAI.VrQTTrr VWror 


6826 


2304 


954 


LKTES FKPW/ VNIAliAFHLLGERASPNSFWQPY IQTLPRE YDTP - 
LYFEEDEVRYLQSTQAIHDVFSQYKNTARQYAYFYKVIQTHPHA 
NKLPLKDS FT YBD YRWAVS S VMTRQNQ I PTEDGSRVTLAIil PLW 
DMCNHTNGIj ITTG YNLEDDRCECVALQDFRAGEQ I y I FYGTRS N 
AEFVI HSGFFFDNNSHDRVKIKLGVSKSDRLYAMKAEVLARAG I 
PTSSVFALHFTEPPISAQLLAFLRVFCMTEEELKEHLLGDSAID 
R I FTLGNS EFP VS WDNEVKLW TFLEDRASLLLKTYKTTI E ED KS 
VLKNHDLS VRAKMAI KLRLGEKEILEKAVKSAAVNREYYRQQME 
EKAPLPKYEESNLGLLESSVGDSRLPLVLRNLEEEAGVQDALNI 
REAI S KAXATENG L VNGENS I PNGTRS ENES LNQES KRAVEDAK 
GSSSDS TAGVKE 


6827 


1 


779


SSWEFGLSVIiGGLFLLFVLENMLGLLRHRGIiRPRCCRRKRRNL 
ETRNLDPENGSGMAJLQPLQAAPEPGAQGQREKNSQHPPALAPPG 
HQGHSHGHQGGTDITWMVLLGDGLHNLTDGLAIGAAFSDGFSSG 
LSTTLAVFCHELPHELGDFAMLLQSGLSFRRLLLLSIiVSGAIiGL 
GGAVLGVGLSLGPVPLTPWVFGVTAGVFLYVALVDMLPALFPSS 
GAP A YA\ HVLLQG LGLLLGG CLMLAI TLLEERLL PVTTEG 


6828 


3 


1654 


xsovnvj/ n j. jjwui v uic><jiuaux Vi^l*K.tiNFGIjHRAMLDIjDNGTRPSE 
LGHLSQTAS LKRGS S FQSGRDDTWR YKTPHRVAFVEKLTKLVLS 
QLPNFWKLWISYVNGSLFSETABKSGQIERSKNVRQRQNDF1CKM 
IQEVMHSLVKLTRGALLPLS I RDGEAKQYGGWEVKCELSGQWLA 
HAIQTVRLTHESLTALE I PNDLLQT1QDLILDLRVRCVMATLQH 
TAEEIKRIiAEKEDWIVDNEGLTSLPCQFEQCIVCSLQSLICGVIjE 
^*^r\*n~ti2> v r yvt'tvl \JcZt, VLvbblN 1MQVFI YCIjEQLSTKPDADI 
DTTHLSVDVSSPDLFGSIHEDFSLTSEQRIiLIVLSNCCYLERHT 

flniaehfekhnfqgiekitqvsmaslkeldqrlfenyielkad 
pivgslepgiyagyfdwkdclpptgvrnylkealvniiavhaev 

FTI S KEL VPRVLS KVI EAVS EELS RLMQ CVS S FS KNGALQARLK 

I calrdtvavyltpeskssfkqalealpqls sgadkklleelln 

: KFKSSMHLQLTCFQAASSTMMKT 


6829 


1 


782


mrmeageaappagaggraaggwgkwvrlnvggtvflttrqtjucr " 

*U*».o r i-*oKJj^y<jfc.b JjQSlJRDETGAYLIDRDPT YFGPILNFLRHG 

klvldkdmaeegvleeaefynigpliriikdrmeekdytvtqvp 
pkhvyrvlqcqeeeltqmvstmsdgwrfeqlvnigssynygsed 

QAEFtiCWSKELHSTPNGLSSESSRKTKSTEEQLEEQQQQEEEV 

eb veveqvqveadaqek/ccykpeapgceapdhlqgixsvp I i 


6830 


1


939


MB PGS VEN.DS I vyrs RDFL WNKHWD VRIDS KAWRETI/JLQKQIj 

ryrfpeladpdtcygfrfchqldfstsgalcvalnkaaagsayr 
cfkerrvtkaylallrghiqesrvtishaigrnstegrahtmci 

EGSQGCENPKPSLTDLWLEHGIiYAGDPVSKVLLKPLTGRTHQL 
xv v \n»~i5AiAjxi*'vvULHj f xGBVSGKEDRPFRMMLHAFYIiRIPTDT 
ECVEVCTPDPFLPSLDACWSPHTLLQSLDQLVQALRATPDPDPE 
DRG P RPGS PS ALLPG PGRPPPP PTKP PETEAQRG PCLQWIiS EWT 
LEPDS 


6831 


3


1087 


SLFFGSSTPDWKVAEQEDLETQPSPSVEKAVTVIDPEGTIPTNF ' 
NVAEKPADHSLSEVKLKTADEPRGTLVKSGDGQNVKEKSMILSN 
VEDLQQ PKF I S E VSRED YGKKE I SGDS EEMNI NS WTSADGENIj 
EIQSYSLIGEKLVMEEAKTIVPPHVTDSKRVQKPAIAPPSKWNI 
S I FKEE PRSDQKQKSLLS FDWD KVPQQPKSASSNFASKN I TKE 
SEKPESIILPVEESKGSLIDFSEDRLKKEMQNPTSLKISEEETK 
IjRS VS PTEKKDNLENR\ S YTL\AEKKVIiAEKQNS V\ APLELRDS 
NE IGKTQITIfGSRSTELKESKADAMPOHFYONPrivwPD dvt rirr> 
SEKEKDEKKKK 


6832 


1809 


412 


MGSGLI S GPPQDNSGEALKE P BRAQEHS LPNFAGGQHFFE YLLV 
VSLKKKRSEDDYEPIITYQFPKRENLLRGQQEEEERLLKAIPL.F 
CFPDGNEWASLTEYPRETFSFVLTNVDGSRKIGYCRRLLPAGPG 
PRLPKVYCIISCIGCFGLFSKILDEVEKRHQISMAVIYPFMQGti 
REAAFPAPGKTVTLKSFIPDSGTEFISLTRPLDSHLEHVDFSSL 
LHCLSFEQILQIFASAVLERKIIFLAEGLSTLSQCIHAAAALLY 
PFSWAHTYIPWPESLLATVCCPTPFMVGVQMRFQQEVMDSPMS 
EVLLVNLCEGTFLMSVGDEKDILPPKIKJDDILDSLGQGINELKT 
AEQINEHVSGPFVQFFVKIVGHYASYIKREANGQGHFQERSFCK 
ALTS KTNRRFVKKFVKTQLFSJbFIQEAEKSKNP PAG YFQQKILE 
YEEQKKQ/ TETKGKNCE I RAWNKND 


6833 


1 


1129 



PI^TLSQCGUiPGHGHSHGGHGHGHGIiPKGPRVKSTRPGSSTOr" 
7APGEQGPDQEETNTLVANTSNSNGLKLDPADPENPRSGDTVEV 
3 VNGNJL VRE PDHMEL EEDRAGQ LNMRG VFLHVLGDALGSVI VW 
NAI^VFYFS WKGCSEGDFCVNPCFPDPCKAFVEI INSTHAS VYEA " 

GPCWVLYLDPTLCWMVCILLYTTYPLLKESALILLQTVPKQID 

IRNIiIKELRNVEGVEEVHBLHVWQIAGSRIIATAHIKCEDPTSY 

ME VAKT I KDVFHNHG I HATT 1 QPE FAS VGS KS S WPCE LACRTQ 

CALKQCCGTLPQAPSGKDAEKTPAVSISCI,BIjSNNLEKKPRRTK 

AENIPA\WIEIKN\IPNK\QPESSL 


6834 


78


1151 


AGQERPAi>IWRLIiWLPTPSVSRKAEPAHIPINR*GA*E*RGGLP 
LCGS S ASAYGWH * RLTPWS PGGS * HM * S S KA P VTQ ARE VLVAG P 
CS KLVLSGARG I VGTTVQVLVEAQQPLLLLFTG VWG LN L RAGE E 
SRAL*LIEEVTQVRDAHIiGNAWGCAQCI>SQGQVGSALAKALLE 
AAAAVRD CKE VLTVSGDXQQABVS VRL * VRD VCVEEAGCVE FGQ 
AHGRPGLALAKGRGGTNEVEEQVQVDGVQKLVLSAHECHELVAG 
QQDGEDQAARTRIiLQAGAHSVAHGRRQGQAPCRPHQEAGVSCHE 

LQQWGDAL*ARE*APQriVLIiLLEDVAQLRTGKKA*DI»WDVE 
QLLRQL 


6835


1


834 


GIPAADR\EASi^LIKLDISRTFPNLCIFQQGGPYHDMLHSILG 
AYTCYRPDVGYVQGMS F I AAVL I LN LDT ADA F I AFSNLLNKPCQ 
MAFFRVDHGLMLT Y FAAFE VPFEENLPKLPAH FKKNNLT PD I YI» 
I DW1 FTL YSKSLPLDIACRI WDVFCRDGEBFJbFRTALGILXLFE 
DILTKMDFIHMAQFLTRIiPEDLPAEELFASIATIQMQSRNKKWA 
QVLTALQKDSREMREGKSVPPTLRLQREFALGTNQSPMPRPLCC 
FRLTPGQPRRTDAI* 


6836 


1 


850 


MSCGRPPPDVDGMITLKV\D^L1 k YRTSPDSLRRVFEKYGRVGDV 
YIPRE^HTKAPRGFAFVRFHDRRDAQDAEAAMDGAELDGRELRV 
QVARYGRRDLPRSRQGRRHAAGPEAA/ RYGRRSRS YGRRSRS PR 
RRHRSRSRGPSCSRSRSRSRYRGSRYSRSPYSRSPYSRSRYSRS 
PYSRSRYRESRYGGSHYSSSGYSNSRYSRYHSSRSHSKSGSSTS 
SRSASTSKSSSARRSKSSSVSRSRSRSRSSSMTRSPPRVSKRKS 
KSRSRSKRPPKSPEEEGQMSS 


6837 


1 


1369 


TDGAAVAGWPGSDYFPGGTAP/GGPRTRRP\SGTSSSGSKASGP 
PNPPAQGDGTSLSPNYTLESTSGNDGKPVSGGGGRGRGRRKRDS 
GHVS PGTFFDKYSAAPDSGGAPGVS PGQQQASGAAVGGSS AGBT 
RGAPTPHEKALTSPSWGKGAELLLGDQPDLIGSLDGGAKSDSSS 
PNVGEFASDEVSTSYANEDEVSSSSDNPQALVKASRSPLVTGSP 
KLPPRGVC3AGEHGPKAPPPALGLGIMSNSTSTPDSYGGGGGPGH 
PGTPGLEQVRTPTSSSGAPPPDEIHPLEILQAQIQLQRQQFSIS 
EDQPLGLKGGKKGE CAVGASGAQNGDS ELGSCCS EAVKS AMS T I 
D L DS LMAEHS AAW YMPADKALVDSADDDKTLAP WEKAKPQNPNS 
KEAHDLPANKAS ASQPGS HLQCLSVHCTDDVGDAKARAS VP TWR 
SLHSDI SNRFGTFVAALT 


6838 
6839 


16 


499 


bTDTP ppkthmi hhs i sd ykatlrc walgfypme itltwqqdee 

DQTRDMELVETRPAGDGTFQKWAAWVPSGEE/Q/RYMCHVQHE 
GLPE PLTIiR WEQS S QPT I P I VG I VAGLVLLGA WTGAWSAVMC 
RKKNSDRVSYSEAASSDHAQGSDVSLTACKV 


6840 


1 


1195 


«rt*'i*u^3kjir'u±'iiiUj5iAr ^ ^ Kil ij is iiJUbWi j yVKKL»DALiIjSEP I PI HG "~ 
RGNFPTLS VQPRQIRAGG PQHPGGAG \ IHVHRVRLHGSAASHVL 
HPESGLGYKDLDLVFRMDLRSEASFQIiTKAWIiACLLDFLPAGV 
SRAKITPLTLKEAYVQKLVKVCTDSDRWSLISLSNKSGKNVELK 
FVDS VRRQFE fsidsfqii ldslll FGQCSSTPMSE afhpt vtg 
ESLYGDFTEALEHLRHRVIATRSPEEIRGGGIiliKYCHLLVRGFR 
PRPSTDVRALQRYMCSRFFIDFPDLVEQRRTLERYLEAHFGGAD 
WUIR YACLVTLHRVVNES WCLMNHERRQTLDL I AALALQAIjAE 
2GPAATAAIAWRPPGTIX5VVTATVNYYVTPVQPLLAHAYPTWLP 




4254 


2061


i tiQGDFS VPDVPKSMAWCENS ICVGFKRDYYLIRVDGKGS IKEL 
? PTGKQIjE PLVAPLADGKVAVGQDDLTVVLNE EG I CTQKCAIjNW 
TOIPVAMEHgPPYIlAVJuPRYVEIRTFEPRLLVQSIELQRPRFl 
TSGGSNI I YVASNHFVWRLIPVPMATQIQQLLQDKQPEIiALQlA 
EMKDDSDS EKQQQ IHH I KNLYAFNLFCQKRFDESMQVFAKLGTD 
PTHVMGLYPDLLPTDYRKQLQYPNPLPVLSGAELEKAHIjAIjIDY 
LTQKRSQLVKKLNDSDHQSSTSPLMEGTPTIKSKKKLLQIIDTT 
LLKCYLHTNVALVAPLLRLENNHCHIEESEHVLKKAHKYSELI I 
IiYEKKGLHEKALQVIjVDQSKKANSPLKGHERrVQYIiQHLGTENL 
HL I FS YSVW VIjRDFPEDGLK I FTE DLPE VES L PRDR VLGFLI EN 
F KG LA I PYliEHI IHWEETGSRFHNCLIQLYCEKVQGLMKE YLL 
SFPAGKTPVPAGEEEGEL»GEYRQKIiLMFLEISSYYDPGRLICDF 
PFDGIiLEERALLLGRMGKHEQALFIYVHILKDTRMAEEYCHKHY 
DRNKDGNKDVYLSLLnMYLSPPSIHCLGPIKLELLEPKANLQAA 
LQVLELHHSKLDTTKALNLLPANTQINDIRIFLEKVLEENAQKK 
RFNQVLKNLLHAEFXiRV\QEERILHQQVKCIITEEKVCMVCKKK 
I GNSAFARYPNGVWHYFCS\KEVNPADT 


6841 
6842 


1 


3206 


TPSTTGTKSNTPTSSVPSAAVTPLN^LQPtGDYGVGS'KNSKRA 
REKRDSRNMEVQVTQEMRNVSIGMGSSDEWSDVQDIIDSTPELD 
MCPETRLDRTGSSPTQGIVNKAFGINTDSLYHELSTAGSEVIGD 
VDEGADLLGEFSGMGKEVGNLLLENSQLLBTKNALNVVKNDLIA 
KVDQLSGEQEVLRGELEAAKQAKVKLENRIKELEEELKRVKSEA 
1 1 ARRE PKEE AEDVSS YL CTESDK I PMAQRRRFTRVEMARVLME 
RNQYKERLMELQEAVRWTEMIRASREHPSVQEKKKSTIWQFFSR 
LFSSSSSPPPAKRPYPSGNIHYKSPTTAGFSQRRNHAMCPISAG 
SRPLEFFPDDDCTSSARREQKREQYRQVREHVRNDDGRLQACGW 
SLPAKYKQLSPNGGQEDTRMKNVPVPVYCRPLVEKDPTMXLWCA 
AGVNLSGWRPNEDDAGNGVKPAPGRDPLTCDREGDGEPKSAHTS 
PEKKKAKELPEMDATS SRVW I LTSTLTTSKW r I DANQ PGTWD 

QFTVCNAHVLCISSIPAASDSDYPPGEMFLDSDVNPEDPGADGV 
LAGITLVGCATRCNVPRSNCSSRGDTPVLDKGQGEVATIANGKV 
NPSQSTEEATEATEVPDPGPSEPETATLRPGPLTEHVFTDPAPT 
PSSGPQPGSENGPEPDSSSTRPEPEPSGDPTGAGSSAAPTMWLG 
AQNGWLYVHSAVANWKKCMSIKLKDSVLSLVHVKGRVLVALAD 
GTLAI FHRGEDGQWDLSNYHLMDLGH PHHS IRCMAWYDRVWCG 
YKNKVHVIQPKTMQIEKSFDAHPRRESQVRQLAWIGDGVWVSIR 
LDS TLR L YHAHTHQHLQD VD I EP YVSKMLGTGKLGFS FVR I TAL 
LVAGSRIiWVGTGNG WIS I Pl/TET WLHRGQ\ LLG \ LRANKTS P 
TSGEG\ARPGG\IIHVYG\DDSSDRAARSFIPYCSMAQAQLCFH 
GHRDAVKFFVS VPGNVLATLNGSVIiDS PAEGPGPAAPASEVEGQ 

KLRNVLVLSGGEGYIDFRIGDGEDDETEEGAGDMSQVKPVLSKA 
ERSHI I VWQVS YTPE 


6843 


3 


926 


KCOgLSATILTUHQYl^RTPI,CAILKQKAPQQYRlRAKLRSYKP~ 

RRLFQSVKLHCPKCHLLQEVPHEGDLDIIFQDGATKTPDVKLQN 

TSLYDSKIWTTKNQKGRKVAVHFVKNNGILPLSNECLLLIEGGT 

LSEICKLSNKFNSVIPVRSGHEDLELLDLSAPFIiIQGTVHHYGC 
KQWST*RSIQNLNSLVDKTSWIPSSVAPAr 1 r;TVPT nwr™n»iw 

LDDGTGVLEAYLMDSDKFFQI PASEVLMDDDLQKS VDMIMDMFC 

PPGIKIDAYPWLECFIKSYNVTNGTDNQICYQIFDTTVAEDVI 


6844 


2 


851

1 


NHRKVLSGAKKYECNECGKSFAYTSSLIKHRRIHTGERPYECSE 
CGRSFAENSSLIKHLRVHTGERPYECVECGKSFRRSSSLLQHQR 
VHTRERPYECSECGKSFSLRSNLIHHQRVHTGERHECGQCGKSF 
SRKSSIi I IHIjRVHTGERpyECSDCGKSFAEWSSLIKHLR VHTG E 
RPYECIDCGKS FRHS SSFRRHQRVHTGMRPYK* S KFWKFSCPGF 
LLLQGQRVHTGS RC YECDKWG I FFS *NAS FFT* KSAPTEEVPFE 
aJECEKAFSPLSLVTTlFT 




244 


642


SrtULAGFELRKTQTSMSLGTTREKTDRVKSTAYLSPQELEDVFY 
JYDVKSEIYSFGIVLWEIATGDIPFQGCWSEKIRKLVAVKRQOE 
PLGEDCPSEIiREHDECRAHDPSVRPSVDEILKKLSTFSK*CIK 


6845 


3 


1519 


VAVRDECYWRHVFWDQDLWMLLPIIxMCHPETARARLBYRIRTtD' " 
GALENAQNLGYQGAKFAWESADSGLEVCPEDIYGVQEVHVNGAV 
GLAFELY YHTTQDLQL PREAGG WD WRAVAE FWCSRVE WS PREE 
K YHIiRG VMS PDE YHSG VNNS VYTNVL VQNS LR FAAALAQDLGL P 
IPSQWLAVADKIKVPFDVEQNFHPEFDGYEPGEWKQADWLLG 
YPVPFSLSPDVRRKNLE I YEAVTS PQGPAMTWSMFAVGWMELKD 
AVRARG LLDRS FANMAE PFKVWTENADGSGAVNFLTGMGG FLQA 
WFGCTGFRVTRAGVTFDPVCLSGISRVSVSGIFYQGNKLNFSF 
SEDSVTVEVTARAGPWAPHLEAELWPSQSRLSLLPGHKVSFPRS 
AGR I QMS P P KLPGS S S S EFPGRTFSDVRDPLQS PLWVTLGS SS P 
TESLTVDPASE*SGTGASETSLGPSLWPRLHPPLLGTLLACHPS 
PAARLSGKVHAAWPEFKAFCL 


6846 


213 


1258 


I*YFLKTIK* LNRLAEHP * YENEKLTKLRNTIMEQYTRTEESARG 
1 1 FTKTRQS AYALSQ W I TENEK FAE VGVKAHHL I GAGHSS E FKP 
MTQNEQKBVISKFRTGKINLLIATTVAEEGLDIKECNIVIRYGL 
VTNEIAMVQARGRARADESTYVLVAHSGSGVIEHETVNDFREKM 
M YKA IHCVQNMKPEE YAHKI LELQMQS IMEKKMKTKRNI AKHYK 
NNPSLITFLCKNCSVLACSGEDIHVIEKMHHVNMTPEFKELYIV 
RENKTLQKKCADYQINGEI ICKCGQAWGTMMVHKGLDLPCLKIR 
NFVWFKNN S TKKQ YKKWVE LP ITF PNLD YS E CCLFSD ED 


6847 


1450 


348 


SMCWNSDRLEMPLlDIiALILYPPSYVPYTGHLSDDSLSRKYCL 1 ]? ' 
WFEDALNG VL* RAE AI Q PH CVNAGDRME KFRQK YWNKLQ TLRQQ 

PFAYGTLTVRSLLDTREHCLNEFNFPDPYSKVKQRENGVALRCF 
PGWRS LDALG WEERQIiALVKGLIiAGNVFDWGAKAVSAVLES DP 
YFGFEEAKRKLQERPWLVDSYSEWLQRLKGPPHKCALIFADNSG 
IDIILGVFPFVREXjLLRGTEVILACNSGPALNDVTHSESLIVAE 
R I AGMD PVVHS ALREERLLLVQTGS S S PCLDLS RLDKGLAAL VR 
ERGADLWI EGMGRAVHTNYHAALRCESLKLAVI KNAWIiAERLG 
GRLFS VI FKYE VPAE 


6848 




16 


AMWWNSI^IRNIVLSNPKKROTLSLAMLKSLQSDILHDADSND'"" 
LKVIIISAEGPVFSSGHDLKELTEEQGRDYHAEVFQTCSKVMMH 
I RNHP VP VI AM VNGLATAAGCQL VAS CDI AVAS DKS S FAT PGVN 
VGL FCS TPG VALARAVPRKVALEML FTGEP I SAQEALLHGLLNK 
WPEAELQEETMR XARKIAS LSRP WSLGKATF YKQL PQDLGTA 
YYLTSQ AMVDNLALRDGQ EG I TAFLQKRKPVWSHEPV* VEH 


6849 
6850 


70 


821 


SLGVDGSCLEQGS PAPRPQTDTSP * P VGNWATQQEDLYHQS YEC 
VCVLFAS VPDFKEFYSESNTNHEGLECLRLLNE I IADFDELLS K 
PKFSGVEKIKTIGSTYMAATGLNATSGQDAQQDAERSCSHLGTM 
VEFAVALGSKLDVINKHS FNWFRLRVGLNHGP WAGVlGAQKPQ 
YDIWGNTVWASRMESTGVLGKIQVTEETAWALQSLGYTCYSRG 
VI KVKGKGQLCTYFLNTDLTRTGPPS ATLG 




2 


1235 


ARGIiiraEWTFEKIiRQHISRNAQDKQELHLFMIiSGVPDAVFDLTD 
LDVL KLEL I PEAKI PAK I S QMTNLOE LHLPHC? PMn/i?nT npoor 

RDHLRCI^KFTDVAEIPAWVYLLKNLRELYIjIGNLNSENNKMI 
GLESLRELRHLKILHVKSNLTKVPSNITDVAPHLTKLVIHNDGT 
KLLVLNSLKKMMNVAELELQNCELERIPHAIFSLSNLQELDLKS 
NNIRTIEEIISFQHLKRLTCLKLWHNKIVTIPPSITHVKNIjESL 

yfsnnkleslpvavfslqklrcldvsynnismipieigllqnlq 

H LH I TGNKVD I LP KQLFKC I KLRTLNLGQNC I TS L PEKVGQLS Q 

ltqlelkgncldrlpaqlgqcrmlkksglwedhlfdtlplevk 
ealnqdinipfangi 


6851


1765 


660 



/saqvsaregenclgwnladssqesyksleeaedcyppslltld 

bRDLFWQVEQGPLLSCPKAGTDLSMGRARBVGWMAAGLMIGAGA 
3YCVYKLTIGRDDSEKLEEEGEEEWDDDQELDEEEPDIWFDFET 
MAR P WTEDGDWTE PGAPGGTEDRPS GGGKANRAH P I KQRP FPYE " 
HKNTWSAQNCKNGSCVLDLSKCLFIQGKLLFAEPKDAGFPFSQD 
I NS H LA S L S MARNTS PTPD PT VREAL CAPDNLNAS I ES QGQ I KM 
YINEVCKET VSRCCNS FLQQAGI^LLISMTVINNMLAKSASDLK 
F PL I SEGSGGAKVQVLKPLMGL SEKP VLAG E LVGAQML FS FMS Ii 
F I RNGNRE I LLETPAP 


6852 


1 


407 


RTRGE E T YANFIKHNDG KNI FYAART PATLFAVMFAMY IIS GLT 

GFI GLNS I AVLCNL VMGLALI FLCTWAYVKYSGE FR E IGTVI DQ 

IAETLWEQVLKPLGDNLMEENIRQSVTNSIKAGLTDQVSHHARL 
KTD 


6853 


3 


469 


GDS CAVCIELYKPNDL VRILTCNHI FrikTCVDPWLLEHRTCPMC 
KCD I LKALG I EVDVEDGS VSLQVPVSNEI FNSAS S HEEDNRSET 
AS S G YAS VQG TYEP PLEEHVQSTNE SLQLVNHEANS VAVD V I P H 
VDNPTFEEDETPNQETAVREIKS 


6854


1148 


585 


HE S Y I GTFD PG EL CVCAAI QWLQDNS AS YFLNRKLVYE PS TQAK " 
P VKNT FLRMW I YSHH I YQQDLRKKI LD VG KR LD VTGFCMTG KPG 
IICVEGFKEHCEEFWHTIRYPNWKHISCKHAESVETEGNGEDLR 
LFHS FEELLLEAHGDYGLRNDYHMNLGQFLEFLKKHKSEHVFQI 
LFGIESKSSDS 


6855 


1913 


1148 


GRVGGRVGRICSPLSGANEYIASTDTLKTEEVLLFTDQTDDLAK' 
EEPTSLFQRDSETKGESGLVLEGDKEIHQlFEDLDKKLAIiASRF 
YI PEGCIQRWAAEMVVALDALHREG IVCRDLNPNNILLNDRGHI 
QLTYFSRWSEVEDSCDSDAIERMYCAPEVGAITEETEACDWWSL 
GAVLFELLTGKTLVECHPAGINTHTTLNMPEWVSEEARSLIQQL 
LQFNPLERLGAGVAGVEDI KSHPFFTPVDWAELMR 


6856 


1617 


997


VTQLYVSVDASTKDSLKKIDRPLFKDFWQQFLDSLKALAVKQQR 
T VYR LTLVKAWNVDELQAYAQLVSLGNPDFI E VKGVT YCGES SA 
S S LTMAHVP WHEE WQFVRELVDLI P E YEIACEHEHSNCLLI AH 
RKFKIGGEWWTWINYNRFQELIQEYEDSGGSKTFSAKDYMARTP 
HWAL FGASERG FDF-KDTRHQRKNKS KAISGC 


6857 


1 


617 


KGPEATAMVCVCSHPNCRQNHIKPSHSAAQTWCGSPTPASAPNH 
KLMAMEQGKTLPSATEDAKEEGLEAQISRLAELIGRLESKALWF 
DLQQRLSDEDGTNMHLQLVRQEMAVCPEQLSEFLDSLRQYLRGT 
TGVRNCFHI TAVRLSDGFTFVI YEFWETEEAWKRHLQSPLCKAF * 
RHVKVDTLS Q P EALS R I LVPAAWCTVGRD 


6858 


2


669 


RSRGIKDFENDPPLSSCGIFQSRIAGDALLDSGIRISSVFASPA 
LRCVQTAKLILEELKLEKKIKIRVEPGIFEWTKWEAGKTTPTLM 
S LEEL KEANFN I DTD YRPAFPLSALM PAES YQE YMDRCTAS MVQ 
IVNTCPQDTGVILIVSHGSTLDSCTRPLLGLPPRECGDFAQLVR 
KIPSLGMCFCEENKEEGKWELVNPPVKTLTHGANAAFNWRNWIS 


6859




1150 


GETMFKKAKTKAKKKPRKRSDSSGGYNLSDIIQSPSSTGLLKSG 
KTNSVESLPELLTSDSEGSYAGVGSPRDLQSPDFTTGFHSDKIE 
AKVKP YVNGTS PVYSREDLKP WEKS P I LKIS APQP I PSNRI DTT 
S SAS WVAGS FS P VS PPWDLRTIME I E ESRQKCGATP KSHLGKT 
VSHGVKLSQKQRKMIALTTKENNSGMNSMETVLFTPSKAPKPVN 
AWASSLHSVSSKSFRDFLLEEKKSVTSHSSGDHVKKVSFKGIEN 
SQAPKIVRCSTHGTPGPEGNHISDLPLLDSPNPWLSSSVTAPSM 
VAPVTFASIVEEELQQEAALIRSREKPLAIjIQIEEHAIQDLLVF 
YEAFGNPEE F V I VERTPQGPLA VPM WNKHG C 


6860 


1889 


1515 


dkdkkrqkkrgifpkvatnimrawlfqhlthpypseeqkkqlaq 

DTGLTILQVNNWFXNARRIIVQPMIDQSNRAVSQGAAYSPEGQP 
MGS FVLDGQQHMG 1 RPAGPMSGMGMNMGMDGQWHYM 


6861 


1889 


1515


DKDKKRQKKRGIFPKVATNIMRAWLFQHLTHPYPSEEQKKQLAQ 
DTGLT1LQVNNWFINARRIIVQPMIDQSNRAVS0GAAYSPEGQP 
MGSFVLDGQQHMGIRPAGPMSGMGMNMGMDGQWHYM 


6862 


2 


471 


EEIDREFHNKLKLKEDKLEKQEKPVNGEDKGDSGVDTQNSEGNA 
DEEDPLGPNCYYDKTKSFFDNISCDDNRBRRPTWAEERRLNAET 
FGIPLRPNRGRGGYRGRGGLGFRGGRGRGGGRGGTFTAPRGFRG 
GFRGGRGGRE FADFE YRKTTAFGP 


6863 


2216 


487 


PQ E PALKS E FS QVASNT I PL PL PQ PNTCKDNGPCKQ VCS TVGGS 
AI CSCFPG YAIMADGVSCEDQDE CLMGAHDCS RRQFCVNTLGSF 
Y C VNHTVLCADG Y I LNAHRKCVDI NECVTDLHTCSRGEHCVNTL 
GSFHCYKALTCEPGYALKDGECEDVDECAMGTHrOQPGFLCQNT 
KGS F YCQARQRCMDG FLQD PEGNC VD I NBCTS LS EP CRP G FS C I 
NT VGS YTCQRNPL I CARG YHAS DDGTKCVD VNEC ETGVHRCGEG 
QVCHNLPGSYRCDCKAGFQRDAFGRGCIDVNECWASPGRLCQHT 
CENTLGSYRCSCASGFLLAADGKRCEDVNECEAQRCSQECANIY 
GS YQCYCRQG YQLAEDGHTCTDI DECAQGAG I LCTFRCLNVPGS 
YQCACPEQGYTMTANGRSCKDVDECALGTHNCSEAETCHNIQGS 
FR CLRFE CP PNYVQVS KTKCERTTCHDFLECQNS PAR I THYQLN 
FQTGLiLVPAHI FRIGPAPAFTGDTIALNI IKGNEEGYFGTRRLN 
AYTGVVYLQRAVLEPRDFALDVEMKJjWRG^SVTTFLAKMHI fft 
TFAL 


6864 


2 


2933 


LADSS PSNLQI IIKELLSMHHQPDPALTKEFDYLPP VDSRSSSG " 
FVGLRNGGATCYMNAVFQQLYMQPGLPESLLSVDDDTDNPDDSV 
FYQVQSLFGHLMES KLQYYVPENFWKI FKMWNKELYVREQQDAY 
E FFTS LIDQMDE YXjKKMGRDQ I FKNTFQGI YSDQKI CKDCPHRY 
E RE EAFMALNLG VTSCQS LE I S LDQF VRGEVLEG S NAY YC E KCK 
E KR I TVKRTC I KSLPSVLVI HLMRFG FDWESGRS I K YDEQ I R FP 
WMLNMEP YTVSGMARQDSSS E VGENGRS VDQGGGGS PRKKVALT 
ENYELVGVIVHSGQAHAGHYYSFIKDRRGCGKGKWYKFNDTVIE 
EFDLNDETLEYECFGGEYRPKVYDQTNPYTDVRRRYWNAYMLFY 
QRVSDQNSPVLPKKSRVSWRQEAEDLSLSAPSSPEISPQSSPR 
PHR PNNDRLS I LTKLV KKGE KKGLFVE KMPAR I YQMVRDENLKF 
MKNRD VYSSD YFS FVLSLAS LNATKLKHP Y YPCMAKVSLQLAI Q 
FLFQTYLRTKKKLRVDTEEWIATIEALLSKSFDACQWIiVEYFIS 
5EG RE L 1 KI FLLECNVRE VRVAVAT ILEKTLDSAL F YQDKLKS L 
HQLLEVLLALLDKDVPENCKNCAQYFFLFNrFVQKQGIRAGDLL 
LRHS ALRHM I S F LLGASRQNNQ I RRWSS AQAREFGNLHNTVALL 
VLHSDVSSQRNVAPGIFKQRPPISIAPSSPLLPLHEEVEALLFM 
SEGKPYLLEVMFALRELTGSLLALIEMWYCCFCNEHFSFTMLH 
FIKNQLETAPPHELKNTFQLLHEILVIEDPIQVERVKFVFETEN 
GLLALMHHSNHVDSSRCYQCVKFLVTLAQKCPAAKEYFKENSHH 
WS WAVQWLQKKMS EHYWTLQ SNVSNETS TGKTFQRTI S AQDTLA 

YATALLNEKEQSGSSNGSESSPANENGDRHLQQGSESPMMIGEL 
RSDLDDVDP 


6865 


1820 


1242 


DPERWKHLSKVTPPGSSVSTTPVQWRLQSPQSQGSMMPSCNRS 
CSCSRGPSVEDGKWYGVRSYLHLFYEGYAVPPKLEGIGEGEFLV 
nuywuiu x*i\lHLte X UKLiAGTAJjCVAAGVLLAI CLFWAMIGWLSQ 
DTKAE PLDPEAD SHVE V FGDE P EQQLS P I FRNASGQS W FS P PAS 
PFGQSSVQTIQPKRDS 


6866 


1571 


495 


D CPRPR YT L YGLRATCMRDLD WAW INAVSAFKALEQDL P VN I KF " 
IIEGMEEAGSVALEELVEKEKDRFFSGVDYIVISDNLWISQRKP 
AI T YGTRGNS YFMVE VKCRDQD FHS GT FGG I LHE PMADLVALLG 
SLVDSSGHILVPGIYDEWPLTEEEINTYKAIHLDLBEYRNSSR 
VEKFLFDTKEEILMHLWRYPSLSIHGIEGAFDEPGTKTVIPGRV 
IGKFS I RLVPHMNVS AVEKQVTRHLEDVFSKRNS SNKMWSMTL 
GLHP W I ANI DDTQ YIiAAKRAI RTVFGTE PDM I RDGS TI P I AKM F 
QEIVHKSWLI PLGAVDDGEHSQNEKINRWNYIEGTKLFAAFFL 
EMAQLH 
Predicted beginning nucleotide location
corresponding to first amino acid residue
of amino acid sequence
2833 



SEQ 
ID 

NO: 



6867



6868



6869 



6870 



Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 



1704 



346 



1619 



1566 



209 



6872 



880 



6873 



1929 



1126 



Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)



GTRI MSQPKQKELAGFVRQKMLLD YSVYMGRCVPQESRSPQRS P " 
LQSAESSPTAGKKLPBVPPSEEEEQBAWVNALLGRIFWDFLGEK 
YWSDLVSKK I QMKLSKI KIjP YFMNELTLTELDMGVAVPJCI IjQAF 
KP YVDHQGLWI DLEMS YNGS FLMTLETKMNLTKLGKEPLVEALK 
VGE I GKEGCRPRAFCLADS DEES S SAGS SEED DAPEPSGGDKQL 
L PGAEG YVGGHRTS KI MR FVDK I TKS K YFQ KATETE F I KKK I EE v 
VSNTP LLiLTVE VQECRGTliAVNl PP PPTDRVWYGFRKP PHVELK 
ARP KLGER E VTL VHVTD W I E KKLEQE FQKVFVM PNMDDVYI TIM 
HSAMDPRSTS CLLKPFPVEAADQ P 

RPTRPPTRPEEIKNLILPYI S'DMWFVQD^EDFYELFKTDKGFD 



- v v^uwiur i oi_»zr a. I UPj&tU 

KATFESQMS VMRGQ I LNLTQALRDG KS PFQLVQI PCVIVERSQG 
GSQGRI VHLSNS FTQTVNCRKPFFSSW 



MYMBRMDKRALI S FWES VEHLKNANK NEI PQIiVQE I YQNFFVES 
KE I S VEKSI/YKE I QQCLVGNKGI EVFYKIQEDVYETLKDR Y YPS 
FlVSDIiYEKLLIKEEEKHASQMlSNKDEMGPRDEAGEEAVDDGT 
NQ I NEQAS FAVNKLRELNE KL E Y KRQ ALNS IQNAPKPDKK I VS K 
LKDEIILIEKERTDLQLHMARTDWWCENLGMWKASITSGEVTEE 
NGEQLPCYFVMVSLQEVGGVETKNWTVPKRLSEFHNUHRKLSEC 
VPSLKKDQLPSLSKLPFKSIDHTFMEKFENQLNKFLQNLLSDER 
LCQSEALYAFLSPSPDYLKVIDVQGKKNSFSLSSFLERLPRDFF 
S HQEE ETEEDSDLSD YGDD VDGRKDA tiAE PC FMIi IGE I FE I*RGM 
FKWVRRTLI ALVQ VTFGRT I NKQ I RDTVS WI FS EQML VY Y I N I F 
RDAFWPNGKliAPPTTIRS KEQSQETKQRAQQKLIiENI PDMLQSIj 

VGQQNARHGIIKIFNALQETRANKHLbYALMELLIilELCPELRV 
HLDQLKAGQV 



MAAWAATRWWQLLiVLSAAGMGASGAPQPPNILLLI^DDMGWG 
DLGVYGEPSRETPNLDRMAAEGLLFPNFYSANPLCSPSRAAIiLT 
GRLPIRNGFYTTNAHARNAYTPQEIVGGIPDSEQLLPELLKKAG 
YVSKIVGJWHLGHRPQFHPIiKHGFDEWFGSPNCHFGPYDNKARP 
NIPVYRDWEMVGRYYEEFPINLKTGEANI,TQI YLQEALDFI KRQ 
ARHHP FFLYWAVDATHAP VYASKPFLGTSQRGR YGDAVRE I DDS 
IGKILEIiliQDLHVADNTFVFFTSDNGAALISAPEQGGSNGPFLC 
GKQTTFEGGMREPALAWWPGHVTAGQVSHQLGS IMDLFTTSLAL 
AGLT P PS DRAIDG LNLLPTLLQGRLMDRP I FY YRGDTLMAATLG 
QHKAHFWTWTNSWENFRQGIDFCPGQNVSGVTTHNLEDHTKJuPL 
I FHLGR D PGERFPLS FASAE YQEAL SRI TS WQQHQEALVPAQ P 
QLNVCNWAVMNWAPPGCEKLGKCLTPPESIPKKCLWSH 



459 



955 



RMSI^PPIFLKRSEENSSKFVETKQSQTT SIASEDPLQNLCLAS" 
QSVLQKAQQSGRSKCIiKCGGSRMFYCYTCYVPVENVPIEQIPLV 
KLPLKIDI IKHPNETDGKSTAIHAKLLAPEFVNI YTYPCI PE YE 
EKDHE VAli I FPGPQS I S IKD I S FHLQKR I QNNVRGKNDDPDKPS 
FKRKRTEEQE FCDLND S KCKG TTI*KKI I FIDSTWNQTNK I FTDE 
RLQGLLQVEIjKTRKTCFWRHQKGKPDTFLSTIEAIYYFLVDYHT 
D ILKE KY RG QYDNLLFF YS FM YQL I KNAKCSGDKETG KLTH 

FGLLMWLS LI FMKGNCVREDLI FNFLFKLGLDVRETNGLFGNT 
KKLI TEYFVRQKYLE YRRI P YTEPAE YEFLWGPRAFLETSKMIiV 

LRFLAKLHKKDPQSWPFHYLEALAECEWEDTDEDEPDTGDSAHG 
PTSRPPPR 



DEQAVLCSKDKTYDLKIADTSNMLLFIP GCKTPBQLKKEDSHCN 
IIHTEIFGFSNNYWELRRRRPKLKKLKKLLMENPYEGPDSQKEK 
DSNSSKYTTEDIiLDQIQASEEEIMTQLQVLNACKIGGYWRILEF 
D YEMKLLNHVTQI* VDS ES WS FGKVP LNTC LQELG PLEPE EMI EH 
CLKC YGKK YVDEGEVYFE LDADKI CRAAARMIiLQNAVKFNIiAE F 
QEVWQQS VPEGMVTSLDQLKGIALVDRHSRPEI IFLLKVDDLPE 
DNQERFNS LFSLREKWTEED LAP Y I QDLCG EKQT I GALLTK YSH 
SSMQNGVKVYNSRRPIS 


6874 


1 


307 


DS IADHVNSAAVI^EEGTKNLGKAAKYKLAAIjPVAGAIiIGGMVG 
GPIGLLAGFKVAGIAAALGGGVLGFTGGKLIQRKKQKMMEKIiTS 
SCPDLPSQTDKKCS 


6875 


1688 


349 


VIGTGERGNSASEKWEIMFNEELGDPFIIlHSrSLIiNAEEHSIA 
TLLLRIEKBELDMKGSGFYVSLEWVTISKKNQDNKKYEIIKRDI 
LRG KS VPH YAAI E PDGNGLMI VS YKS LT FVQAGQDLBENMDED I 
SEKIKEPLYYWOQTEDDLTVTIRLPEDNTKEDIQIQFLPDHINI 
VJUKDHQFLEGKLYSSIDHESSTWIIKESNSLEISLIKKNEGLTW 
PELVIGDKQGELIRDSAQCAAIAERLMHLTSEELNPNPDKEKPP 
CNAQELEE CD I FFEESSSLCR FDGNTLKTTHWNLGSNQ YLFSV 
IVDPKEMPCFCLRHDVDALLWQPHSSKQDDMWEHIATFNALGW 
QASKRDKKFFACAPNYSYAALCECLRRVFIYRQPAPMSTVLYNR 
KEGRQ VGQ VAKQQVAS LETNDP I LG FQATNERLF VLTTKNL FL I 
KVNTEN 


6876 


41 


1285 


VGEMTLIWRHLbRPLCLVTSAPRILEMHPFLSLGTSRTSVTKLS 
LHTKPRMPPC3>FMPERYQVIFLVNSGSEANELAMI*MARAHSNNI 
DI I S FRGAYHGCSPYTLGLTNVGI YKMELPGGTGCQPTMCPDVF 
RGPWGGSHCRDSPVQTIRKCSCAPDCCQAKDQYIEQFKDTLSTS 
VAKSIAGFFAEPIQGVNGWQYPKGFLKEAFELVRARGGVCIAN 
E VQTGFGRIX3S HF WGFQTHDVLPDI VTMAKG I GNG FPMAAVI TT 
PE I AKS liAKCLQHFNTFGGNPMACAIGS AVLE VI KEENLQ ENS Q 
EVGT YMLIiKFAKLRDE FE I VGDVRGKGLM IGI E MVQDK I S CR PL 

PREEVNQIHEDCKHMGLLVGRGSIFSQTFRIAPSMCITKPEVDF 
AVEVFRSALTQHMERRAK 


6877 


1 


778 


GTS P S PARA YAP PTERKR FYQNVS I TQGEGGFE INLDHRKLKTP 
QAKL FTVPS EALAI AVATE WDSQQDT I KYYTMHLTTLCNTSLDN 
PTQRNKDQLIRAAVKFLDTDTICYRVEEPETLVEIjQRNEWDPI I 
EWAE KRYGVE I SSSTS IMG PS I PAKTR E VLVS HLAS YNTWALQG 
IEFVAAQLKSMVLTLGLIDLRLTVEQAVLLSRIiEEEYQIQKWGN 
IEWAHDYELQELRARTAAGTLFIHLCSESTTVKHKLLKE 


6878 


931 


263 


Q^TLQGDF KNRAEM I DPI) i RI kNVTRSDAGK YRCE VSAPSEQGQN 
LEEDTVTLEVLVAPAVPS CEVPSSAIiSGTWELiRCQDKEGN PAP 
EYTWFKDGIRIjLENPR]X?SQSTNSSYTMNTKTGTIX3FNTVSKLiD 
TGEYSCEARNSVGYRRCPGKRMQVDDI^ISGI IAAWWALVIS 
VCGLGVCYAQRKGYFSKETSFQKSNS SSKATTMS ENDFKHTKSF 


6879 


3 



845 


IRVIGESDIMQEFLSESDENYNGVSDVELRVALPDGTTVTVRVK ' 
KNSTTDQVYQAIAAKVGMDSTTVNYFALFEVISHSFVRKLAPNE 
FPHKLYIQNYTSAVPGTCL T I RKWLFTTEEEI I»LNDNDLAVTYF 
FHQAVDDVKKGY I XAEEKS YQLQKLYEQRKMVM YLNMLRTCEG Y 
NEIIFPHCACDSRRKGHVITAISITHFKLHACTEEGQLENQVIA 
FEWDEMQRWDTDEEGMAFC FE YARGEKKP RWVK I FT P Y FN YMHE 
CFERVFCELKWRKEEY 


6880 


2110 


1437 


RKDNCTAKEWTFPEAKWNTTARVFSHIRLGMGHVLIIVQCFISS 
MANIYNEKILKEGNOLTRQT'B'TnMQTrf vttbt'tt pmot mr r*r s\nr> 

NRDQI KN CGFFYGHRAFS VAL I F VTAFQGLS VAF I LKFLDNMFH 

VLMAQVTTVI I TTVS VLVFDFRPSLEFFLEAPSVLI»S I F I YNAS 

KPQVPEYAFRQERIRBLSGNLWERSSGDGEELERI/TKPKSDESD 
EDTF I 


6881 


2638 


2244 


NDS KWEDI HV I TGALKMFFR ELPE PJj FT FNHFND F VNA I KQE PR ' 
QRVAAVKDL I RQLP KPNQDTMQ I L FRHLRRV I ENGEKNRMT YQS 
IAIVFGPTLLKPEKETGNIAVHTVYQNQIVELILLELSSIFGR 


6882 


1 


850 


GIPEAQLWIYPVKSCKGVPVSEAECTAMGLRSGNLRDRFWLVIN 
QEGNMVTARQEPRLVLISIiTCDGDTIiTLSAAYTKDLLLPIKTPT 
TNAVHKCRVHGLE I EGRDCGEATAQWITS FLKSQP YRLVHFE PH 
MRPRRPHQIADLFRPKDQIAYSDTSPFLILSEASLADLNSRLEK 



566 








KVKATNFRPNIVISGCDVYAEDSWDELLIGDVELKRVMACSRCI 
LTTVDPDTGVMSRKEPLETLKSYRQCDPSERKLYGKSPLFGQYF 
VLENPGTIKVGDPVYLLGQ 


6883 


2794 


2256 


NS KLKLNQNL KLF I TLT YQ VLS LHGWGPG I HLQKEGAF P VTQNR 
ALQLL YDLR YLNI VLTAKGDE VKSGRSKPDSRI EKVTDHL BAL I 
DPFDLDVFTPHLNSNLHRLVQRTSVLFGLVTGTENQLAPRSSTF 
NSQEPHNILPLASSQIRFGLLPLSMTSTRKAKSTRNIETKAQYD 
ANC 


6884 


2 


99 


EFERVTAEAVKPRETSEPRAAAQRFCEKFPFL 


6885 


297 


1554 


STGQFWHVTDLHLDPTYH ITDDHTKVCASSKGANASNPG PFGDV 
LCDS P YQLILS AFDFI KNSGQEAS FM I WTGDS PPHVPVPELSTD 
TVINV I TNMTTT I QSLFPNLQVFPALGNHD YW PQDQLS WTS KV 
YNAVANLWKP WLD E EA I S TLRKGG F YSQ KVTTNPNIjR I ISLNTN 
LYYGPNIMTLNKTDPANQFEWLESTLNNSQQNKEKVYIIAHVPV 
GYLPSSQNITAMREYYNEKLIDIFQKYSDVIAGQFYGHTHRDSI 
MVLSDKKGSPVNSLFVAPAVTPVKSVLEKQTNNPGIRLFQYDPR 
DYKLLDMLQYYLNLTEANLKGESIWiCLEYILTQrYDIEDLQPES 
L YGLAKQFT I LDS KQFI KYYNYFFVS YDSSVTCDKTCKAFQ I CA 
IMNLDNIS YADCLKQLY I KHNY 


6886 


2 


1341 


QCGG I PGREGGSSRPLEEGTGSSPACVRGAAPGSEDAFYPTRAK 
QARVSQELKKAAKRTVSISEGPDTLGDGMRERRETLALAPEPEP 
LEKEACEKWKRPFRSASATSLTLSHCVDWKGLLDFKKRRGHSI 
GGAPEQRYQI I P VCVAARL PTRAQD VLDAHLSE VNAVR FGPNSS 
LLATGGADRLIHLWNWGS RLEANQTLEGAGGS ITS VDFDPSGY 
Q VLAAT YNQAAQLWKVGEAQ S KETLS GHKDKVTAAKFKLTRHQA 
VTGSRDRTVKE WDLGRA YCSRTINVLS YCND WCGDHI I ISGHN 
DQKIRFWDSRGPHCTQVIPVQGRVTSIiSL3HDQLHLLSCSRDNT 
LKVIDLR VSNI RQVFRADGFKCGSDWTKAVFS PDRS YALAGS CD 
GALYIWDVDTGKLESRIiQGPHCAAVNAVAWCYSGSHMVSVDQGR 
KWLWQ 


6887 


1047 


116 


WTARPSQKPFWEAGAVPGDPIjSTGCSQAQLGGCCPRGPWGPQHG 
GQQRAAGPTLPRGERGGPQQSGPGLAAQTPPTSKQVAWRAFLTG 
TYRSQS PRSPAGP FRGGTG WWPE PAVCLCVAVGPQRLS S PGLVY 
NASGSEHCYDIYRLYHSCADPTGCGTGPDARAWDYQACTEINIjT 

FASNNVTDMFPDLPFTDELRQRYCLDTWGVWPRPDWLLTSFWGG 
DLRAASNIIFSNGNLDPWAGGGIRRNLSASVIAVTIQGGAHHLD 
LRASHPEDPASWEARKLEATIIGEWVKAARREQQPALRGGPRL 

SL 


6888 


1 


992 


FVAYVKKEI PHIVVTHCLLNPHALVI KTLPTKLjRDAIjFTVVRVT 

NFIKGRAPNHRLFQAFFEEIGIEYSVLLFHTEMRWLSRGQILTH 
IFEMYEEINQFLHHKSSNLVDGFENKEFKIHLAYLADLFKHLNE 
LSASMQRTGMNTVSAREKLSAFVRKFPFWQKRIEKRNFTNFPFL 

EEIIVSDNEGIFIAAEITLHLQQLSNFFHGYFS IGDIiNEASKWI 
LDPFLFNIDFVDDSYLMKNDIiAELRASGQILMEFETMKLEDFWC 
AQFTAFPNLAKTALE 1 lmpfattylcelgfs ITFTFQNKVPEAA 

lilsdd irvai skkvpsflghh 


6889 


1 


1534 


ltlenq i ke ereqdns espngrts plvsqnneqgstlrdl.lttt 
agklrvgstdagiafapvysmgapssksgrtmpnilddiiasw 

ENKIPPSKTSKINVKPEIjKEEPEESIISAVDENNKLYSDIPHSW 
ICEKHILWLKDYKNSSNWKLFKEa"IKQGQPAVVSGVHKKMNISL 
WKAESISLDFGDHQADLIjNCKDS I isnanvkefwdgfeevskrq 
knksgetwlklkdwpsgedfktmmparyedllkslplpeycnp 
egkfnlashlpgffvrpdlgprlcsaygwaakdhdigttnlhi 
evsdwnilvyvgiakgngilskagilkkfeeedlddilrkrlk 

DSSEIPGALXiTHIYAGKDVDKIREFLQKISKEQGLiEVLPEHDPIR 
DQS W YVNKKLRQRLIjEE YG VRTWTIi 1 QFLGDAI VLPAGALHOVQ 








N FHS C I QVT EDFVS P EHLVES FHIiTQ E LRLLKEE I N YDDKLQVK 
NXLYHAVKEMVRALKIHEDBVDDMEEN 


6890 


3 


667 


THACGMWI PLYLHRALWHKTAETCNSPPCGAKDSL'i'FGAITCF 
TG FLGVDTGAGATRW CRLKTQRADP LVCAVGMLGS A IFICLIFV 
AAKSS I VGAYI CI FVGETLLFSNWAI TADI LMYWI PTRRATAV 
ALQSFTSHLLGDAGSPYLIGFISDLIRQSTKDSPLWEFLSLGYA 
LMLCPFWVJUGGMFFLATALFFVSDRARAEQQVNQLAMPPASVK 


6891 


1980 


1262 


lrihqellskelkx^lrgitiesiihiglaagkeqfmOdasnvmcT" 
l llktqshlynmednnp e vrqaaa yglg vmaqfggdd yrs lcs e 

AVPLLVKVIKRAHSKTKKNVIATENCISAIGKILKFKPITCVNVD 
E VLPHWLS WL PLHEDKEEAI QTL S FIjCDL I ESNHP WI GPNNSN 

LPKIISIIAEGKINETINYEDPCAKRLANWRQVQTSEDLWLEC 
VSQLDDEQQEALQELLNFA 


6892 
6893 


3 


876 


R S VAAASG PGAWGTDH Y C LELLRKRD YEGYLCS LLLPAE SR SS V 
FALRAFNVEI*AQVKDSVSEKTIGLMRMQFWKKTVEDIYCDNPPH 
QPVA1 ELWKAVKRHNLTKRWLMKI VDEREKNLDDKAYRN I KELE 
NYAENTQSSLLYliTLE I LGI KDLHADHAASH I GKAQG 1 VTCLRA 
TPYHGSRRKVFLPMDICMLHGVSQEDFLRRNQDXNVRDVIYDIA 
SQAHLHLKHAR S FHKT VP VKAFPAFLQTVS LEDFLKK I QR VDFD 
IFHPSLQQKNTLLPLYLYIQSWRKTY 




1 


842 


DGERKSMSVERTFSEINKAEEQYSLCQELCSEIiAQDLQKERIiKG 
RTVTtKLKNVNFEVKTRASTVSSWSTAEEIFAIAKELLKTEID 
ADFPHPLRLRLMGVRISSFFNEEDRKHQQRS 1 I GFLQAGNQALS 
ATECTLEKTDKDKFVKPLEMSHiaCSFFDKKRSERKWSHQDTFKC 
EAVN KQS FQTSQPFQ VLKKKMNENLE I SENSDD CQ I LT C P VC FR 

AQGC1SLEALNKHVDECLDGPSISBNFKMFSCSHVSATKVNKKE 
NVPAS SLCEKQD YEAH 


6894 


1742 


1463 


TTLCKPIjVPREHQFYETLPAEMRKFTPQYKGKSQLLEGLPHWRG 

DVRDRGHGRPWQPSLEPSLPPTLCFPSLSSFSSSWPSAQHLTPS 
VFNPW 


6895 
6896 


2379 


478 


VTYVELCDIASPTALLIMRTVLDLIVEDLQSTSEDKEQQYTSQT 
TRLLALL YALASHKACKLAI LHL INGT I KG DE RYAE I FQDLLAL 
VRSPGDSVIRQQCVEYVTSILQSLCDQDIALlLPSSSEGSISEL 
EQLSNSLPNKELMTS I CD CLIiATLANSE SS YNCLLTCVRTMM FL 
AEHDYGIiFHLKSSLRKNSSAIiHSLLKRWSTFSKDTGELASSFL 
E FMRQ I LNSDT IGCCGDDNGLMEVEGAHTSRTMS INAABLKQLL 
QS KEES PENIiFLELEKLVLEHSKDDDNLDSLLDSWGLKQMLES 
SGDPLPLSDQDVEPVLSAPESLQWLFNNRTAYVLADVMDDQLKS 
M WFTP FQAEE I DTDLDLVKVDL I ELS BKCCSDFDLHS ELERS FL 

SEPSSPGRTKTTKGFKLGKHKHETFITSSGKSEY1EPAKRAHVV 
PPPRGRGRGGFGQG I RPHDI FRQRKQNTSRPPSMH VDDFVAAES 
KE WpQDG I P PPKRPLKVS QK I S SRGGFS GNRGGRGAFHSQNRF 

FTPPASKGNYSRREGTRGSSWSAQNTPRGNYNESRGGQSNFNRG 
PLPPLRPLSSTGYRPS PRDRA SBf5Rf5f5T^3i><5 W7ii c ivm cr» er^z-n^-o 

KFVSGGSGRGRHVRSFTR 




1 


555 


GN I VIQKKKYNKQH 1 1 PLENVT I DS I KDEGDLRNG WL IKTPTKS 
FAVYAATATEKSEW^1^INKCVTDLLSKSGKTPSNEHAAVWVPD 
S EATVCMRCQKAKFT P VNRRHHCR KCG FWCG PCS E KRFLLPS Q 

SSKPVRICDFCYDLLSAGDMATCQPARSDSYSQSLKSPLNDMSD 
DDDDDDSSD 


6897 


3 


920 

1 


GDGLMHEVWGJ^ERPDWETAlQKPI^SLPAGSGNAI^ 
AGYEQVTNEDLLTNCTLIiLCRRI»LSPMNLLSLHTASGLRLFSVL 
SIiAWGFIADVDLESEKYRRI/SEMRFTLGTFIJlIiAALRTYRGRIiA 
5fLP VGRVGS KTPASP WVQQG P VDAHLVPLEEP VPSHW TWPDE 
VL VLALLHS HZjGSEMFAAPMGRCAAG VMHL F YVRAG VSRAML 




I 919 




LRLFIAMBKGRHMEYECPYLVYVPWAFRLEPKDGKGVFAVDGE 
LMVSEAVQGQVHPNYFWMVSGCVEPPPSWKPQQMPPPEEPL 


6898 


S 120 


346 


QKTVTAVASLIiKGRQGIYTENERRMGAVIKIRFFKIMLVLIICW 
LSNIINESLLFYLEMQTDINGGSLKPVRTAAKTTWFIMGILNPA 
QGFLLSLAPYGWTGCSLGFQSPRKEIQWESLTTSAAEGAHPSPL 
MPHENPASGKVSQVGGQTS DEALSMLS EGSDAS TI E I HTAS ES C 
NKNEGDPALPTHGDL 


6899 




827 


MKVRKKWDAYbLDKNKINMDCFISCFFKKr^TTLMFSHSGILSL 
LEHGEEYTFSIiPCAYARSirjTVPWVELGGKVSVNCAKTGYSASI 
TFHTKP FYGGKLHRVTAEVKHNI TNTVVCRVQGEWNSVLEFTYS 
NGETKYVDLTKLAVTKKRVRPLEKQDPFESRRLWKNVTDSliRES 
EIDKATEHKHTLEERQRTEERHRTETGTPWKTKYF1KEGDGWVY 
HKPLWKIIPTTQPAE 


6900 


3 


451 


TE VLGS KG XHE DRS S TSALHHAIiE ES ASLLTM FWRAAL PS THI P 
VLPGKVGESTERELLEI.RTKVSQQEQI,LQSTTEHLKNANQQKES 
MEQFIVSQLTRTHDVLKKARTNLEVRKbLHQSEAPSLSPTHHHP 
LADLVGDSWPAIjRFQEK 


6901 


1 


201 


DDNMVQRLETX3FKMTLC^SrLEQWAAWl^NVMMQALKPYEGRP 
SFPKAARQFLLKWSFYRYHLGFS 


6902 
6903 


2 


267 


GAPPPPPSQPPRQPPQAAPSSHPHSDLTFNPSSALEGQAGAQGA ' 
SDMPEPSLDLLPELTNPDELLSYLDPPDLPSNSNDDLLSLFENN 




i 


149 


RINQVYRQGPTGIHILVIDQMVQNFQDESCFLFSTVKAE^flbGI 
HULK 


6904 
6905 


464 


2092 


MKASLPVSLSCVJUACGDVEGKFDILFNRVQAIQKKSGNFDLbLC 
VGNFFGSTQDAEWEE YKTGI KKAPIQTYVLGANNQETVKYFQDA 
DGCE LAENI T YJLGRKG IFTGS SGLQ I VYLSGTBS LNEP VPG YS F 
SPKDVSSLRMMLCTTSQFKGVDILLTSPWPKCVGNFGNSSGEVD 
TKKCGSALVSSIATGLKPRYHFAALEKTYYERLPYRNHIILQEN 
AQHATRFIALANVGNPEKKKYLYAFS I VPMKLMDAAELVKQPPD 
VTENPYRKSGQEASIGKQILAPVEESACQFFFDLNEKQGRKRSS 
TGRDSKSSPHPKQPRKPPQPPGPCWFCIiASPEVEKHIiWNIGTH 
CYLALAKGGLSDDHVLILP IGHYQS WE LSAEWE E VEK YKATL 
RRFFKSRGKWCWFERNYKSHHLQLQVI PVPISCSTTDDIKDAF ' 
ITQAQEQQIELLEIPEHSDIKQIAQPGAAYFYVBIiDTGEKLFHR 
I KKNFPLQFGRE VLAS EAI LNVPDKS DWRQCQI S KEDE ETIARR 
FRKDFEPYDFTLDD 


6906 


1 


226 


VSKTGEAETITSHYLFALGVYRT1,YLFNWIWRYHFEGFFI5LIAI — 
VAGLVQTVLYCDFFYLYXTKVLKGKKLSLPA 


6907 


3 


611 


SYDDHNGHIDFITAASNLRAKMYSIEPADRFKTKRIAGKIIPAI 
ATTTATVS GL VAliEM I KVTGG Y P FE AYKNWFLNIAI P I WFTE T 

TEVRKTKIRNGISFTIWDRWTVHGKEDFTLLDFINAVKEKYGIE 
PTMWQGVKMLYVPVMPGHAKKLKtiTMHICLVKPTTEKKYVDLTV 
SFAPDIDGDEDLPGPPVRYYFSHDTD 




2 


2228 

1 


LRG VP VWAAGAFR FS SGEES TS HL I MS RRSQR LTR YSQGDDDGS " 

SSSGGSSVAGSQSTLFKDSPLRTLKRKSSNMKRLSPAPQliGPSS 

DAHTSYYSESLVHESWFPPRSSLEELHGDANWGEDLRVRRRRGT 

GGSESSRASGLVGRKATEDFLGSSSGYSSEDDYVGYSDVDQQSS 

S S RLRS AVS RAGS LLWMVATS PGRLFRLL YWWAGTT WYRLTTAA 

SLIiDVFVLTRRFSSLKTFLWFLLPLLLLTCliTYGAWYFYPYGLQ 

TFHPALVS WWAAKDS RRAOEG WEARDS S PHFQAEQR VM SR VHS L 

ERRLEALAAEFSSNWQKEAMRLERLELRQGAPGQGGGGGLSHED 

riALLEGLVSRREAALKEDFRRETAARIQEELSALRAEHQQDSE 

DLFKKI VRASQESEAR I QQLKS EWQSMTQES FQESSVKELRRLE 

DQIAGLG^EIiAAliALKQSSVAEEVGLLPQQIQAVRDDVESQFPA 

aSQFLARGGGGRVGLLQREEMQAQLRELESKILTHVAEMQGKS 








AREAAAS LS liTLQKEGVIGVTEEQ VHH I VKQAIiQRYSEDR I GLA~" 
D YALESGG AS V I S TRCS ET YETKTALLSL FG X PLWYHS QS PR V I 
LQ PDVHPGNCWAFQGPQG FAWRL SARIRPTAVTLEHVPKALS P 
NSTISSAP KDFAI FGFDEDLQQEGTLLGKFTYDQDGEP I QTFHF 
QAPTMATYQWBLRIIjTNWGHPEYTCIYRFRVHGEPAH 


6908 


3 


780 


QVPSAAWLMAVCGLGSRLGLGSRLGLQGCFGAARLIjYPRFQSRG 
PQGVEDGDRPQPSSKTPRIPKIYTKTGDKGFSSTFTGERRPKDD 
QVFEAVGTTDELSSAIGFALELWEKGHTFAEELQKIQCTLQDV 
GSALATPCSSAREAHLKYTTFKAGPILELEQWIDKYTSQLPPLT 
AFILPSGGKISSALHFCRAVCRRAERRWPLVQMGETDANVAKF 
LNRLSDYLFTLARYAAMKEGNQEKIYKKNDPSAESEGL 


6909 


3 


409 


GRLLAVGTDLYGQRSSAPEQELLVQDATPVSNSLLPEKAFSDIP 
SPYLRGTIKMMQAVRQAFQDQDDRRTWDGRPLTMAATFDDCLYA 
LCWDTIKRSSQTGEWQNIAIMTEEPELSPAYLISEAMRRSRMS 
LYC 


6910 


1 


10^8 


LVPVWIDSYYYGKLVIAPLNIVLYNIFTPHGPDLYGTEPWYFY 
L I NG FLNFIfVAFAIiALLVLPLTSLMEYLLQRFHVQNIiGHPYWLT 
LAPMYI WFI I FFIQPHKEERFLFPVYPLICLCGAVALSALQHSF 
LYFQKCYHFVFQRYRLEHYTVTSNWLALGTVFLFGLliSFSRSVA 
LFRG YHGPLDL Y PE FYR IATDPT IHTVPEGR P VNVCVGKE W YRF 
PSSFLLPDNWQLQFIPSEFRGQLPIOPFAEGPLATRIVPTDMNDQ 
NIjEE PSRYID I S KCHYLVDLDTMRETPREPKYSSNKEBW I S LAY 
RP FLDASRS S KLLRAF YVP FL SDQ YTV YVNYT I LKPRKAKQ IRK 
KSGG 


6911 


1184 


966 


GEDAEEMETGNVANLISIFGSSFSGLLRKSPGGGREEEEGEESG 
PEAAEPGQICCDKPVLRDMNPWSTAIVAF 


6912 


1 


844 


AMKPVETHSFQMLFTILSTGSALKAQSYEDAYRCIKSSILLCSI 
SGGTD11SCFMGHNFSLPVYKGEIQARNLGMAVEAWNEEGKAVW 
GESGELVCTKPIPCQPTHFWNDENGNKYRKAYFSKFPGIWAHGD 
YCRINPKTGGIVMLGRSDGTLNPNGVRFGSSEIYNIVESFEEVE 
DSLCVPQYNKYREERVILFLKMASGHAFQPDLVKRIRDAIRMGL 
SARHVPSLILETKGIPYTLNGKKVEVAVKQIIAGKAVEQGGAFS 
NPETLDLYRDIPELQGF 


6913 


1643 


1558 


KKSHEESHKEELSYGAQASLPLPCSDFR 


6914 


1251 


615 


ELAAECKSAGYPGTLIPYRCDLSNEEDILSMFSAIRSQHSGVDI 
CINNAGLARPDTLLSGSTSGWKDMFNVNVLALSICTREAYQSMK 
ERNVDDGHIININSMSGHRVLPLSVTHFYSATKYAVTALTEGLR 
QELREAQTHIRATCISPGWETQFAFKLHDKDPEKAAATYEQMK 
CLKPEDVAEAVIYVLSTPAHIQIGDIQMRPTEQVT 


6915 


254 


652 


GRSLSFKTFLIWVLISIYQGGILMYGALVLFESEFVHWAISFT 
ALILTELLMVALTVRTWHWLMWAEFLSLGCYVSSLAFLNEYFD 
VAFITTVTFLWKVSAITWSCLPLYVLKYLRRKLSPPSYCKLAS 


6916 


254 


652 


GRSLSFKTFLIWVLISIYQGGILMYGALVLFESEFVHWAISFT 
ALILTELLMVALTVRTWHWLMWAEFLSLGCYVSSLAFLNEYFD 
VAFITTVTFLWKVSAITWSCLPLYVLKYLRRKLSPPSYCKLAS 


6917 


254 


652 


GRSLSFKTFLIWVLISIYQGGILMYGALVLFESEFVHWAISFT 
ALILTELLMVALTVRTWHWLMWAEFLSLGCYVSSLAFLNEYFD 
VAFITTVTFLWKVSAITWSCLPLYVLKYLRRKLSPPSYCKLAS 


6918 


28 


921 


PEAGTRSWREPDPEDLRRFLLSAACRSFPQWLPGGGGGQVSSCS 
DTDVPYLLLAVKSEPGRFAERQAVRETWGSPAPGIRLLFLLGSP 
VGEAGPDLDSLVAWESRRYSDLLLWDFLDVPFNQTLKDLLLLAW 
IjGRHCPTVSFVLRAQDDAFVHTPALLAHLRALPPASARSLYLGE 
VFTQAMPLRKPGGPFYVPESFFEGGYPAYASGGGYVIAGRLAPW 
LLRAAARVAPFPFEDVYTGLCIRALGLVPQAHPGFLTAWPADRT 
ADHCAFRNLLLVRPLGPQASIRLWKQLQDPRLQC 


6919 
6920 


850 
1418 


41 
591 


gGRRELSGSVFCPFIQQEPKEMLTLSEYHERVRSQGQQLQQLQA " 

E LDKLHKE VS T VRAANS ERVAKLVFQRLNE DF VRKPD YALS SVG 

AS I DLQKTSHD YADRNTAYF WNR FS FWNYARPPTVILE PHVFPG 

N CWAFEGDQGQ WI QLPGRVQLSD I TLQH P P P S VEHTGGANSAP 

RDFAVFFLLSFFTHQGLQVYDETEVSLGKFTFDVEKSEIQTFHL 

QNDPPAAFPKVKIQILSNWGHPRFTCXjYRVRAHGVRTSEGAEGS 
AQGPH 

EAQGPSKVHLTLKKKK 


6921 


2 


1711 


mnatrseeqfhviniu^qtlrkmenylkekqlctvll'iaghlri 

PAHRLVLSAVSDYFAAMFTNDVLEAKQEEVRMEGVDPNALNSLV 
0 YAYTG VIjQLKEDT I E S LIiAAACLLQLTQ V I D VCSNFL I KQLHP 
SNCLG IRS FGDAQG CTELLNVAHK YTMEHF I E VI KNQE FLLLPA 
NEISKLLCSDDINVPDEETIFHALMQWVGHDVQNRQGELGMLLS 
Y I RL P LLP PQLLADIiE TS SM FTGDLE CQKLLMEAMK YH LLP E R R 
SMMQS PRTKPRKST VGAL YAVGGMDAMKGTTT IEKYDLRTNSWL 
HIGTMNGRRLQFGVAVIDNKLYWGGRDGLKTLNTVECFNPVGK 
I WT VM PPMS THRHGLGVATLEG PM YAVGGHDG WS YLNTVER WD P 
EGRQWNY VAS MST PRS TVG WALNNKL YAI GGRDGS S CLKSME Y 
FDP HTNKWS LCAPMS KRRGG VG VAT YNGFLY VVGGHDAPASNHC 
SRLSDCVBRYDPKGDSWSTVAPI>SVPRDAVAVCPLGDKLYWGG 
YDGHTYI/NTVESYDAQRNEWKEEVPVNIGRAGACVVVVKLP 


6922 


1075 


369 


LTPPAGIRHEVRDREREREREREREKFPLDSTGSELKQNIHSIT 
GLPPAMQKVMYKGIAPEDKTLREIKVTSGAKIMGGGSTINDVLA 
VNTPKDAAQQDAKAEENKKEPLCRQKQHRKVLDKGKPEDVMPSV 
KGAQERLPTVPLSGMYNKSGGKVRLTFKLEQDQLWIGTKERTEK 
LPMGSIKNWSEPIEGHEDYHMMAFQLGPTEASYYWVYWVPTQY 
VDAIKDTVLGKWQYF 


6923 


2469 


1660 


LGL FCI LP I DTLCAVLE RDTLS I RESRIjFGAWRWAEAE CQRQQ ~ 

LPVTFGNKQKVLGKALSLIRFPLMTIEEFAAGPAQSGILSDREV 

VNLFIiHFTVNPXPRVEYIDRPRCCLRGKECCINRFQQVESRWGY 

SGTSDRIRFTVNRRISIVGFGLYGSIHGPTDYQVNIQIIEYEKK 

QTLGQNDTGFSCDGTANTFRVMFKEPIEILPNVCYTACATLKGP 

DSH YGTKGLKKWHETPAAS KT VFFF FSS PGNNNGTS I EDG Q I P 

EIIFYT 


6924 


2210 


1235 


PEERVICFVEYYLTAFHEGRKGALAKKPYNPIIGETFHCSWEVP 
KDR VKP KRTASRS PAS CHEH PMADD P S KS YKLRFVAEQ VSHH PP 
ISCFYCECEEKRLCVNTHVWTKSKFMGMSVGVSMIGEGVLRLLE 
HGEEYVFTLPSAYARS ILTI PWVELGGKVSINCAXTG YSAT VI F 
OTKPFYGGiCVHRVrAEVKHNPTNTIVCKAHGEWNGTLEFTYNNG 
ETKVIDTTTLPVYPKKIRPLEKQGPMESRNLWREVTRYLRLGDI 
DAATEQKRHLEE KQRVEER KRENLRT PWKPKYF I QEGDGS G I LQ 
SPLESTLMGLEVQSFPV 


6925 
6926 


2 
1 


1653 
733 


RGGAAGAAM BP D S VIED KT I E LM CS VPRS LWLG CANLVESMCAL 
SCLQSMPSVRCLQISNGTSSVIVSRKRPSEGNYQKEKDLCIKYF 

DOWSES DO VP* TTVPlJT T CDMPU VAUnit TMfi «t rr-rx+w-r ^ _ 

^w" 0 ^ y wvw v c.nux bKMLM YQHGH INS YLKPMLQRDF I TALP 
EQGLDH I AENI L S YLDARS LCAAELVCKEWQRVI S EGMLWKKL I 
ERM VRTDPLWKGLS ERRGWDQ YLFKNRPTDGP PNS F YRS L Y P KI 
IQDIETIESNWRCGRHNLQRIQCRSENSKGVYCLQYDDEKIISG 
LRDNS I KIWDKTSLECLKVLTGHTGSVLCljQYDERVI VTGSSDS 
TVRVWD VNTGBVLNTL IHHNEAVLHL R FSNGLMVTCS KDRS IAV 
WDMASATDI TLRRVLVGHRAAVNWDPDDKYI VSASGDRTI KVW 
STSTCEFVRTLNGHKRGIACLQYRDRLWSGSSDNTIRLWDIEC 
SACLRVLEGHE ELVRCIRFDNKR I VSGAYDGK I KVWDLQ AALD P 
RAPAS TLCLRTLVEHSGR VFRLQFDE FQ 1 1 S S SHDDTI L I WDFL 
ffVP PSAQNETRS PS RTYTY I S R 

SGRVAMDGLGLQFPEQGFPAGPPLI.PPHMGGHYRDCQSIiGAPPL ~ 



6927 



6928 
6929 



1086 
1749 



131 






1484 






777 



DGYPLPTPuTgPbDGVDPDPAFFAAP MPGDCPAAGTYSYAQVSD " 
YAGPPEPPAGPMHPRLGPEPAGPS I PGLLAPPSALHVYYGAMGS 
PGAGGGRGFQMQPQHQHQHQHQHH P PG PGQ P TP P PEALP CRDGT 

DPSQPAELLGEVDRTEFEQYLHFVCKPEMGLPYQGHDSGVN^PD 
SHGAI SS WS DASSAVY Y CNYPDV 

| JjTLCGDI QLMIiAQNANNRAAHLE E FH YQTKE DQE I LKS LHRES S 

CQGFAWATDLSTDLESQLSVSCKCYEAANEILQFRDLKSQWPEH 

yvqvlkrmgnirneigvfymnqaaalqserlvsksvsaaeqqlw 
I kk sfscfekgihnfesiedatnaalllcntgrlmricaqahcga 
gdelkrefspeeglyynkaidyylkalrslgtrdihpavwdsvn 

WELSTTYFTMATLQQDYAPLSRKAQEQIEKEVSEAMMKSIiKYCD 
VDSVSARQPLCQYRAATIHHRLASMYHSCLRNQVGDEHLRKQHR 
VLADLHYSKAAKLFQLLKDAPCELLRVQLERVAFAEFQMTSQNS 
NVGKIiKTLSGALDIMVRTEHAFQLIQKELIEEFGQPKSGDAAAA 
ADASPSLNREEVMKLIiSIFESRIiSFLLLQSIKLLSSTKKKTSNN 
[ I EDDT I LKTNKH I YSQLLRATANKTATLLE R INVI VHLLGQLAA 
GSAASSNAVQ 



EAI-UIjXWNIjIjQVKMRKRYSVDKTLS HPWIjQDYQTWLDIjRELECK 
I GER YI THES DDLRWEK YAGEQGLQ YPTHL INPS A3HSDTP ETE 
ETEMKALGBRVSIL 

RDQRGYRDDRSPAREPGDVSARTRS GGGGGRSATTAMPPPVPNG 
NLHQHDPQDLRHNGNWVAGRPSCSRGPRRAIQKPQPAGGRRSG 
RGPAAGGLCLQPPDGGTCVPEEPPVPPMDWEALEKHIAGLQFRE 
QEVRNQGQARTNSTSAQK1TORESIRQICLALGSFFDDGPGIYTSC 
SKSGKPSLSSRLQSGMNLQICFVNDSGSDKDSDADDSKTETSLD 
TPLS PMS KQS S S YSDRDTTE E ESESLDDMDFLTRQKKLQAEAKM 
ALAMAKPMAKMQVEVEKQNRKKSPVADLLPHMPHISECLMKRSL 
KPTDIjRDMT IG QLQVI VNDLHS Q IESLNEELVQLIiIi I RDELHTE 
QDAMLVPI EDLTRHAES QQKHMAE KMPAK 

FKDTANVFVSIiFQMRNNFRHYFIEP SQLKLFYDVITWIVTQVAI 
SYTWPFVLLSIKPSLTFYSSWYYCLHILGIJCVLLLLPVKKTQR 
RKNTHENIQLSQS KKFDEGENSLGQNS FSTTNNVCNQNQE IASR 
HSSLKQ 



607 



545 



1431 



3030 



659 



FVERLPNRPAOJLVASG^EGVSAQSFIjHCFTMASTAFNLQVAT 
PGGKAMEFVDVTESNARWVQDFRLKAYASPAKLES IDGAR YHAIj 
LIPSCPGALTDLASSGSLARILQHFHSESKPICAVGHGVAALCC 
ATNEDRSWVFDSYSLTGPSVCELVRAPGFARLPLWEDFVKDSG 
ACFSASE PDAVHWIjDRHLVTGQMASSTVP AVQNLIiFLCGSRK 
yVDSPGQGaQAEEEEGGIQMNSRMRA HSPAEGASVESSSPGPKK 
SDMCEGCRSIAAGHPGYISHDKETSIKYVSHQHPSHPQLFSIVR 
QACVRSLSCEVCPGREGPIFFGDEQHGFVFSHTFFIKDSIiARGF 
QR W YS 1 1 TI MMDR I YL I NS WPFLLGKVRG I IDELQG KALKVFEA 

EQFGCPQRAQRMNTAFTPFLHQRNGNAARSLTSLTSDDNLWACL 
HTSFAWLLKACGSRLTEKLLEGAPTEDTLVQMEKLADLEEESES 
WDNSEAEEEEKAPVLPESTEGRELTQGPAESSSLSGCGSWQPRK 
LPVFKSLRHMRQVGGRGTAHHELRRRANHGLCLPTRLASGPSTL 
KTLQEVTDSLLGGWLMAQGVGGI I 

■ SLNLHCTLPyFPHQYPAGY PSDKEGKKPKG^KKQPSGTTKRPI 

I SDDDCPSASKVYKASDSAEAIEAFQLTPQQQHLIREDCQNQKLW 

DEVLSHLVEGPNFLKKIiEQSFMCVCCQELVYQPVTTECFHNVCK 

I DCLQRSFKAQVFSCPACRHDLGQNYIMIPNEXLQTLLDLFFPGY 
SKGR 

, DRDHSQC^GiKRVAIJ^VS SViU.ISKAKIR^KMTFlIVIAFIV ' 
| CWTPFFFVQMWSVWDANAPKBASAFI I VMLLASLNSCCNPWI YM 
| L FTGHL FHEIiVQR FLCCSAS YLKGRRLGETS AS KKS NS S S FVLS 
HR3 SSQRSCSQPSTA 



1131 



890 



2588 


6935 


886 


543 


NSALYVAGGNDGTSCLNSVBRYSPKAGAWESVAPMNIRRSTHDL 
VAMDGWLYAVGGNDGSSSLNSIEKYNPRTNKWVAASCMFTRRSS 
VGVAVLELLNFPPPSSPTLSVSSTSL 


6936 


1347 


567 


RSHRRQFLSRALLEFFGKSHPPPHRLFRKSLNVGLHYSHIPFLT 
TCLHFLRICRLQKGEVGLSVETSKPQVPVGGLSRKKVPQEPWATV 
MEKRLQEAQLYKEEGNQRYREGKYRDAVSRYHRALLQLRGLDPS 
LPSPLPNLGPQGPALTPEQENILHTTQTDCYNNLAACLLQMEPV 
NYERVREYSQKVLERQPDNAKALYRAGVAFFHLQDYDQARHYLL 
AAVNRQPKDANVRRYLQLTQSELSSYHRKEKQLYLGMFG 


6937 


1 


727 


AVEFRCCPGRDPACFARGWRLDRVYGTCFCDQACRFTGDCCFDY 
DRACPARPCFVGEWSPWSGCADQCKPTTRVRRRSVQQEPQNGGA 
PCPPLEERAGCLEYSTPQGQDCGHTYVPAFITTSAFNKERTRQA 
TSPHWSTHT3DAGYCMEFKTESLTPHCALENRPLTRWMQYLREG 
YTVCVDCQPPAMNSVSLRCSGDGLDSDGNQTLHWQAIGNPRCQG 
TWKKVRRVDQCSCPAVHSFIFI 


6938 


3 


719 


NSRKLELAERVDTDfTIQLKKRRQSSEBCENDSGTLDTVGAWVDH 
EGNVAAAVSSGGLALKHPGRVGQAALjYGCGCWAENTGAHNPYST 
AVSTSGCGEHLVRTILARECSHALQAEDAHQALLETMQNKFISS 
PFLASEDGVLGGVIVLRSCRCSAEPDSSQNKQTLLVEFLWSHTT 
ESMCVGYMSAQDGKAKTHISRLPPGAVAGQSVAIEGGVCRLGEP 
SELTLQAECEASQRHFRT 


6939 


3 


810 


KVTAPRRPQRYSSGHGSDNSSVLSGELPPAMGRTALFHHSGGSS 
GYESLRRDSEATGSASSAPDSMSESGAASPGARTRSLKSPKKRA 
TGLQRRRLIPAPLPDTTALGRKPSLPGQWVDLPPPLAGSLKEPF 
EIKVYEIDDVERLQRPRPTPREAPTQGLACVSTRLRLAERRQQR 
LREVQAKHKHLCEELAETQGRLMLEPGRWLEQFEVDPELEPESA 
EYLAALERATAALEQCVNLCKAHVMMVTCFDISVAASAAIPGPQ 
EVDV 


6940 


1188 


496 


GKMAAQPLRHRSRCATPPRGDFCGGTERA3DQASFTTSMEWDTQ 
WKGSSPLGPAGLGAEEPAAGPQLPSWLQPERCAVFQCAQCHAV 
LADSVHLAWDLSRSLGAVVFSRVTNNVVLEAPFLVGIEGSLKGS 
TYNLLFCGSCGIPVGFHLYSTHAALAALRGHFCLSSDKMVCYLL 
KTKAIVNASEMDIQNVPLSEKIAELXEK1VLTHNRLKSLMKILS 
EVTPDQSKPEN 


6941 


1 


713 


SLSRADSDPHGPHTCGHVLNVI I GSNVLALAEAQRQAEALGYQA 
WLSAAKQGDVKSMAQ FYGLLAHVARTRLTPSMAGAS VEEDAQI. 
HELAAELQ I P DLQLE EALETMAWGRGP VCLLAGG EPTVQLQGSG 
RGGRNQE LALRVGAE LRRW PLG P I DVIjFLSGGTDGQDG PTEAAG 
AWVTPELASQAAAEGLDIATFLAHNDSHTFFCCLQGGAHLLHTG 
MTGTNVMDTHLLFLRPR 


6942 


1 


246 


GD YVER YD P KTDTWTMGAPLSM PTNAVGG C LLGDRL YADGG YDG ' " 
QTYLNTMES YDPQTNE WTQMASIiNIGRAGAC WVI KQP 


6943 


1 


739 


PMATGDGAKTLAIHVKALTADSIRITWKATLPASSFRT Qvn prr 
HS PAGGS I TETLVQGDKTEYXiLTALEPKPTYI ICMVTMETTNAY 
VADETP VCA KAE TADS YG PTTTLNQEQNAGPMAS LPLAG 1 1 GGA 
VAL VFLFLVLGAI CW YVHQ AGELLTRERAYNRG S RKKDDYMESG 
TKKDNSILEIRGPGLQMLPINPYRAKEEYWHTIFPSKGSSLCK 
ATHTIG YGTTRGYRDGG I PDIDYSYT 


6944 


960 


IS* 


VANILLNGVKYESELTGSSERAEQPLSVGRLCSTICNMPKALRT 
LCVNHFLGWLSFEGMLLFYTDFMGEWFQGDPKAPHTSEAYOKY 
NSGVTMGCWGMCIYAFSAAFYSAILEKLEEFLSVRTLYFIAYLA 
FGLGTGLATLSRNLYWLSLCITYGILFSTLCTLPYSLLCDYYQ 
SKKFAGSSADGTRRGMGVDISLLSCQYFLAQILVSLVLGPLTSA 
VGSANGVMYFSSLVSFLGCLYSSLFVIYEIPPSDAADEEHRPLL 
LNV 


6945 
6946 


2067 


179 


EGEDRGDPRTMGAALGTGTRIAPWPGRACGALPRMTPTAPAQGC 
HSKPGPARPVPLKKRGYDVTRNPHLNKGMAFTLEERliQLGIHGIj 
IP P C FLS QDVQLLR I MRYYE RQQS DLDKYI I LM TLQDRNE KLFY 

RVLTSDVEKFMP I VYTPTVGIiACQHYGLTFRRPRGLFIT IHDKG 
HIATMI^SWPEDNIKAVWTDGERILGLGDLGCYGMGIPVGKLA 
LYTACGGVNPQQCLPVLLDVGTNNEELr^DPLYIGLKHQRVHGK 
A YDDLLDE FMQAVTDKFG I NCLIQFEDFANANAFRLLNK YRNK Y 
CM FNDD I QGTAS VAVAG I LAALRITKN KLSNHVFGFQGAGEAAM 
G \ IAHLLVMALE \ KEGVPKA\ EATRKI W\MVDF \KGL I VQGRDH 
LNHEKEMFAQD\HPEVNSIiEEWRLVKPTAlIGVAAIAEA\FTE 
QILRDMASFHERP\IIFALSNPTSKAECTA\EKCYRVTEGPRGF 
FAS\GSPF*GVLIWEMGKTFIPGGRGNNA*RVPRGWQLGVHSPG 
GDPGHIP\DEIFLPDSRAKLPQEVSEQHLSQGRLYP\PLST\IR 
NVFLRIAIKVFD*GYKHNLV\SYYPEPKD\KEAFCKIPGSYTPD 
YDS FYT/VDS Y I WAQGKAMNVQTV 




133 


2551 


SCEYSGlTVAPGDPCPGVAHLIiAPSMASDTPBSLMALCTDFCIiR 
NLDGTLGYLLDKETLRLH PD I FLPS E I \CDRLVNE Y VELVNAAC 
NF\EPHE\SFPNPLFRDPRKQPASRRIHL\RED\LVQD\QD\LE 
AI RKQDL\ VE L \ YLTN \ C E KLS AKS LQTLRSFSHTLGVP * AFFG 
C\TNILLLRKENPGGL/CEDEYLFNPTCQVLVKDFTFEGFSRLR 
F\ LKLGRM I D WVP VES \LLRPLNSLAALDLSGIQTSDAA\ FLTQ 
WKDSL\VSLVL\YNMDLSDDHIR\VIVQLHKLRHLDISRDRLSS 
YYKFKLTREVLSLFVQKIiGNIjMSLDISG\HMILENCSISKIGKR 
EAGQTSI\EPSK\SSIIPFRGFEGGPLQF\LGVF*GIFCGRLTH 

ipaykvsgdkneeqvlnaieaytehrpeitsrainllfdiarie 

RCNQLLRALKLVITALKCHKYDRNlQVTGSAALFYliTNSEYRSE 

qsvklrrqviqwlngmesyqevtvornccltlcnfsipeelef 

QYRRVNELLLS ILNPTRQDES IQRIAVHLCNALVCQVDNDHKEA 
VGKMG F WTMLKIiTQKKIjLDKTCDQVMEFS W \ SAL WNI TDETPD 

NCEMFLNFNGMKLFLDCLNEFPEKQELHRNMLGLLGNVAEVKEL 
RPQLMTSQFISVFSNLLESKADGIEVSYNACGVLSHIMFDGPEA 
WGVCEPQREEVEERMWAAIQSWDINSRRNINYRSFEPILRLLPQ 

GISPVSQHWATWALYNLVSVYPDKYCPLLIKEGGMPLLRDIIKM 

atarqetkemarkviehcsnfkeenmdtsr 


6947 
6948 


2 


1682 


TSVSTIPRGLASARPQSRSWRCCPVWRRSPGRARGRGLKMLNVP 
SQSFPAPRSQQRVASGGRSKVPLKQGRSLMDWIRLTKSGKDLTG 
LKGRLIEVTEEELKKHNKKDDCWICIRGFVYNVSPYMEYHPGGE 

delmraagsdgtelfdqvhrwvnyesmlkeclvgrmai kpavlk 

DYREEEKKVLNGMLPKSQVTDTLAKEGPSYPSYDWFQTDSLVTI 

/ehiy*tegyqfrlnns*sse*flysrnny*gllisytyw/r*a 

MRFRKIFIiCGL/CESVGKIEIVLQKKENTSWDFLGHPLKNHNSL 

iprkdtglyyrkcqliskedvihdtrlfclmlppsthlqvpigq 
hvylklp itgte i vkp ytpvsgslls e fkepvlpnnkyi yfli x 

IYPTGLFTPELDRLQIGDFVSVSSPEGNFKISKFQELEDLFLLA 
AGTGFTPMVKILNYALTDI PSLRKVKLMFFNKTEDDI I WRSQLE 
KLAFKDKRLDVEFVLSAP I S EWNGKQGHI S PALLS E FLKRNLDK 
SKVLVCI CGPVPFTEQGVRLLHDLNFSKNEIHSFTA 


6949 


104 


58 

< 


PDGAHSFFPDEYFTCSSLCLSCGVGCKKSMNHGKEGVPHKAKSR^ 

CKYSHQyDNRVYTCKACYERGEEVSWPKTSASTOSPWMGLAKY 

AWSGYVIECPNCGWYRSRQYWFGNQDPVDTWRTEIVHVWPGT 

DGFLKDNNNAAQRLLDGMNFMAQSVSELSLGPTKAVTSWLTDQI 

APAYWRPNSQILSCNKCATSFKDNDTKHHCRACX5EGFCDSCSSK 

rRPVPERGWGPAPVRVCDNCYEAR/TRPVSCYRGTSGR * RRRRT 

3ETVE 




152 


46S6 ( 

< 


^RLCLSRPLTRPGDDSVGGaAMASG^GVUJGGGGKIRTRRCH " 
3GPIKPYQQGRQQHQGILSRVTESVKNIVPGWI<QRYFNKNEDVC 



"6950" 



"2585- 



411 



1940 



239 



acSTDTSBVPRWPENKEDH LVYADEESSNJTDGRITPEPAVSNT 
EEPSTTSTAST\YPDVLTRVSLYRSHLNFSMLESPALHCQPSTS 
SAFPIGSSGFSLVKEIKDSTSQHDDDNISTTSGPSSRASDKDIT 
VS KNTSLP PLWS PEAERS H5 LSQHTATSSKKPAFNLS AFGTLS P 
SLGNS S I LKTSQLGDS P F YPGICTT YGGAAAAVRQSKIiRNTP YQA 
PVRRQMKAKOLSAQSYGVTSSTARRILQSLEKMSSPLADAKRIP 
S I VS S PLNS P L DRSG I D I TD FQAKRE KVDSQ Y P P VQR LMTPK PV 

SIATNRSVYFKPSLTPSGEFRKTNQRIDKKCSTGYEKNMTPGQN 
REQRESGFSYPNFSLPAANGLSSGVGGGGGKMRRERHAFVASKP 
LEEEEMEGPVLPKISLPITSSSLPTFNFSSPEITTSSPSPINSS 
QALTNKVQMTS PSSTGSPMFKFSS PI VKSTEANVLPPSS IGFTF 
SVPVAKTAELSGSSSTLEPIISSSAHHVTTVNSTNCKKTPPEDC 
EGPFRPAEILKEGSVLDILKSPGFASPKIDSVAAQPTATSPWY 
TRPAISSFSSSGIGFGESLKAGSSWQCDTCLLQNKVTDNKCrAC 
QAAKLSPRDTAKQTGIETPNKSGKTTLSASGTGFGDKFKPVIGT 

wdcdtclvqnkpeaikcvacetpkpgtcvkraltlWvsesaet 

MTAS S S S CT VTTGTLGFGD K FKR PIGS WECS VC C VSNNAE DNKC 
VSCMSEKPGSSVPTSSSSTVPVSLPSGGSLGLEKFKKPEGIWDC 
ELCLVQNKADSTKC1ACESAKPGTKSGFKGFDTSSSSSNSAASS 
SPKFGVSSSSSGPSQTLTSTGNFKFGDQGGFKIGVSSDSGYINP 
MSEGF*FSKHIVGFKFGVSSESKPEEVKKOSKNDNFKFGLSFGL 
SNPVFLTPFQFGVSNtiGQEEKKEELLKSSCAGFRFGTGVINSTR 
VPANT I VT3ENKSS FNLGTI ETKS VS VAPLKCQTS EAKKEEMPA 
TKGGFSFGNVBPASLPSASVFVLGRTEEKQQEPVTSTSLVFGEG 
KLTMKEPKC\QPVFSFGEFQRQTKDENSSKSTFSFSMTKPSEKE 
SEQ PAKATFAFGAQTNTTADQGAAKPDLS YLNNS SS S SSTPATS 
AGGG \ I FG SSTS S S NPPVATFVFGQS SNPGSS S \AFGNTAES ST 
SQSLLFSQDSKLATTSSTGTAVTPFVFGPGASSNNTTTSGFGFG 
ATTTSSSAGSSFVFGTGPSAPSASPAFGANQTPTFGQSQGASQP 
NPPGFGSISSSTALFPTGSQPAPPTFGTVSSSSQPPVFGQQPSQ 
S AFGSGTT PNS S S AFQFG SSTTNFNFTNNS PSG VFTFGANS S TP 

AASAQPSGSGGFPFNQSPAAFTVGSNGKNVFSSSGTSFSGRKIK 
TAVRRRK 



FKPGSRSGLC'RRAGERGAV RAGGliSRRTRAE * IMDELHYQDTDS " 
DVPEQRDS KCKVKWTHEEDEQ LRALVRQ FGQQDWKFI*ASHFPNR 
TDQQCQYRWLRVLWPDLVKGPWTKEEDQKVIEIiVKKYGTKQWTL 
I AKHLKGRIiGKQCRERWHNHLNPE VKKSCWTEEEDR 1 I CEAHKV 
LGNRWAEIAKMIiPaRTDNAVKNHWNSTIKRKVDTGGFLSESKDC 
KPPVYLLLE LEDKDGLQS AQP TEGQGS IiLTNWPS VP PTI KE E EN 
S EE ELAAATTS KEQ E P IGTD LDAVRTPEPLE E FPKREDQEGS P P 
ETSLPYKWVVEAANLLrPAVGSSLSEALDLIESDPDAWCDLSKF 
DLPEEPSAEDSINNSLVQLOASHQCX}VLPPRQPSA\LVPSVTEY 
RLDGHTISDLSRSSRGELIPISPSTEVGGSGIGTPPSVLKRQRK 
RR VALS P V TEUSTSLS FLDS CNS L TP KS TP VKTLPFS PS Q FLNF 

WNKQDTI^LESPSLTSTPVCSQKVVVTTPLHRDKTPLHQKHAAF 
VTPDQKYSMDNTPHTPTPFKNALEKYGPLKPLPQTPHLEEDIjKE 
VLRSEAGIELIIEDDIRPEKQKRKPGLRRSPIKKVRKSLALDIV 
DEDMKLMMSTLPKSLSLPTTAPSNSSSLTLSGIKEDNSLLNCXSF 
LQAKPE KAAVAQKPRSHFTTPAPMSS AW KTVACGGTRDQLFMOE 
KARQLLGRLKPSHTSRTLILS 



AGPDDTMKRSLQAI J YCQLLSFl,LIIiALTE ALAFAmKPSPRESL " 

QVLPSGTPPGTMVTAPHSSTRHTSWMLTPNPDGPPSQAAAPMA 

TPTPRAEGHPPT\TPSPPSLRQ*PPPirJCAP/sSTGPAPAAMAT 

TSSKPEGRPRGQAAPTILLTKPPGATSRPTTAPPRTTTRRPPRP 

PGSSRKGAGNSSRPVPPAPGGHSRSKEGQRGRNPSSTPLGQKRP 

LGKIFQIYKGKrFTGSVEPEPSTLTPRTPLWGYSSSPQPQTVAAT 
SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid
residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid
residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine,
R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)
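The legend in the column heading defines a small alphabet: twenty-one one-letter residue codes plus three symbols (* for a stop codon, / for a possible nucleotide deletion, \ for a possible nucleotide insertion). As a minimal sketch of how an entry in the amino acid column could be read against that legend, the Python below tallies each kind of symbol and reports any character the legend does not define. The function name summarize_segment and the result keys are hypothetical and are not part of the specification; treating whitespace as layout only, and reporting lowercase letters as unrecognized, are assumptions of this sketch.

```python
from collections import Counter

# One-letter residue codes, copied from the table legend.
RESIDUES = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid",
    "E": "Glutamic Acid", "F": "Phenylalanine", "G": "Glycine",
    "H": "Histidine", "I": "Isoleucine", "K": "Lysine",
    "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine",
    "S": "Serine", "T": "Threonine", "V": "Valine",
    "W": "Tryptophan", "Y": "Tyrosine", "X": "Unknown",
}

# Non-residue symbols defined by the same legend.
MARKERS = {
    "*": "stop_codons",
    "/": "possible_deletions",
    "\\": "possible_insertions",
}


def summarize_segment(segment):
    """Hypothetical helper: tally legend symbols in one table entry."""
    counts = Counter(residues=0, stop_codons=0,
                     possible_deletions=0, possible_insertions=0)
    unrecognized = []
    for ch in segment:
        if ch.isspace():
            continue  # assumption: spaces and line wraps are layout only
        if ch in RESIDUES:
            counts["residues"] += 1
        elif ch in MARKERS:
            counts[MARKERS[ch]] += 1
        else:
            # Anything outside the legend (digits, lowercase, punctuation)
            # is reported rather than guessed at.
            unrecognized.append(ch)
    summary = dict(counts)
    summary["unrecognized"] = unrecognized
    return summary


if __name__ == "__main__":
    # One line taken from the table; the raw string keeps the backslash.
    example = r"TAASGAPVSP/PSCPSAFSAPPPR*PTGWPQP**LLAYCYP\CT"
    print(summarize_segment(example))
```

Reporting unrecognized characters instead of silently dropping them makes it easier to spot entries where the printed sequence was damaged in reproduction.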


6952


658




TVPSNTSWAPTTTSIjGPAKDKPGLRRAAQtjtaGSTFTSQGGTPDA 1 

TAASGAPVSP/PSCPSAFSAPPPR*PTGWPQP**LLAYCYP\CT 

S RPLSTS SGVFTAATGPTPAAFDTS VSAPSQGI PQGAS TTPQAP 

THPSRVSESTISGAKBETVA\PSP*PTGCPVLSPQWYPQPQAIS 

STAWSPPGPGSLGQQGTSPMWPRGTNRSTEPPSA*ARWISPG*S 

Zl^f PP \ LC PADGVLHBEEEEDRQPGEQPEAYC3NNTHHPGT 

TFQQAC\RGAAPGEIPVPLKPLRTQLSEPRSPANGDYRDTGMVP 


6953






PSSEGESGEMTDRYTIHSQLKHLQSKVlGT\ATPTPPSGSG\CE- 
PTPRLVLLLHGPLRPSQLLRHCGE*EQSASPLQLDGKDASALWT 
ASRQARGELRLCLTTAVRGTSPSVSPVCQSS I 


6954 


1512 


349 


MW^KTRALASGKHVPFGKQTWPNKS/ VHCD3 * G"** RRE TTQDES 1 

FSPHFRGKMGGW\KI,EKBLENTEQPVGGNEG*EHEVTGNLNSD 

PLLELCQCPLCQLDCGSREQLIAHVYQHTAAWSAKSYM\CPVC 

GRALSSPGSLGRHLLIHSEDQRSNCAVCGARFTSHATFNSEKLP 

EVLNMESLPTVHNEGPSSAEGKDIAFSPPVYPAGILLVCNNCAA 

YRKLLEAQTPSVRKWALRRQNEPLEVRLQRLERERTAKKSRRDN 

E TP E EREVRRMRDREAKRLQRMQE TDEQRARRL.QR DREAMR hKR 

AIETPEKRQARLIREREAKRLKRRLEKMDMMLRAQFGQDPSAMA 
[ ALAAEMNFFQLPVSG VELDSOLLQKM AFRRQMQ cct.u | 


6955 


819 
19^8 


1 


PPPPFI I PSMPREAGT*AG * KRSGDSKCS PPVEQ »A*TRAAAQN 

* PQR * RWTEGNS PQAS AVATPGQGASPAAPRCTP * PSRRHRRLP 
PGARPPAG*AAPAPTKPWLAGPASAPQPGAAPLSPPAPPLIRTR 

* CAGAAARGR PRRDRS PRPRTPGGCS WSEPRTPPAVSASAQTPS 
DAG*AGGR*GQRQRPSTGR*PPGVGGAGRSHRREGTIPGNPHPR 

| ^*^WQR*PGP/REWGL*EPQGBBMSGPGGPGGAPPNQVGSS 


6956 




782 


Tjoon^^^ ^^ A ^GHWGTRAK^ V K1T>GKRRARK1T0PFIjGQD 
WRS PGWS W I KTEDG WKRCES CSQKLER ENNHCNI S HS 1 1 LNS ED 
GE I FNNEEHE YAS KKR KKDHFRNDTNTQS F YREKW I YVHKE STK 
ERHGYCTLGEAFNRLDFSSAIQDIRRFNYWKXLQLIAKSQLTS 
LSGVAQKNYFNILDKXVQKVLDDHHNPRLIKDIXQDLSSTLCrL 
/N*RSREVCISGKHQYLDLPIRNYSRLATTATGSSDD*ASE\NG 
LTLSDLPLHMLNKILYRFSDGWDIITLGQVTPTLYMLSEDRQLW 
KKLCQYHFAEKQFCRHLILSEKGHIEWKLMYFALQKHYPAKEQY 
GDTLHFCRHCS I L FWKDSGHP CTAADPDS CFTP VS PQH FI DLFK 

1 




8605 


3839 



y l bTS I FAS PTS P P VLGES VLQDNS FDLNNGSDAEQEEMETQS S 

DFPPSLTQPAPDQSSTIQLHPATSPAVSPTTSPAVSLVVSPAAS 

PEISPEVCPAASTWSPAVFSWSPASSAVLPAVSLEVPLTASV 

a or» P VT S PAAAFPTAS PANKDVS S FLBTTADVEE ITGEGLT 

ASGSGDVMRRRIATPEEVRLPLQHGWRREVRIKKGSHRWQGETW 

YYGPCGKRMKQFPEVIKYLSRNWHSVRREHFSFSPRMPVGDKF 

EBRDTPEGLQWVQLSAEEIPSRIQAITGKRGRPRNTEKARTKEV 

P KVKRGRGRP PKVK I TELLNKTDNR PLKKLEAQETLNEEDKAK I 

AKSKKKMRQKVQRGECQTTIOX30ARNKRKQETKSLKQKEAKKKS 

KAEKEKGKTKQEKLKEKVKREKKEKVKMKEKEEVTKAKPACKAD 

KTLATQRRLEERQRQQMILEEMKKPTEDMCLTDHQPLPDFSRVP 

?^ SGAFSD ^ TIVEF ^ SFG ^ LGraPAKD VPSLGVLQEGL 

LCQGDSLGEVQDLLVRLLKAALHDPGFPSYCQSLKILGEKVSEI 

PLTRDNVSEILRCFLMAYGVEPALCDRLRTQPFQAQPPQQKAAV 

[AFLVHE LNGSTIi I INEIDKTLESMSSYRKNKWI VEGRLRRLKT 

/LAKRTGRSEVEMEGPEECLGRRRSSR1MBVTSGMEEEEEEESI 
\AVPGRRGRRDGEVDATASSIPELERQIEKLSKRQLFFRKKLLH 
JSQMLRAVSLGQDRYRRRYWVLPYLAGrFVEGTEGNUVPEEVIK 
^TDSLKVAAHASLNPALFSMKMEIAGSNTTASSPARARGRPRK 
82



6958



274 



6959 



3514



1663 




iK^^MgPKHLKSPVRGQDSiiQPQAQLQPKAQLHAPAQPQPQLQ- 
LQLQSHKGFLEQEGSPLSLGQSQHDLSQSAFLSWLSQTQSHSSL 
LSSSVLTPDSSPGKLDPAPSQPPEEPEPDEAESSPDPQALWFNI 
SAQMPCNAAPTPPPAVSEDQPTPSPQQLASSKPMNRPSAANPCS 
PVQFSSTPLAGLAPKRRAGDPGEMPQSPTGLGQPKRRGRPPSKF 
FKQMEQRYLTQLTAQPVPPEMCSGWWWIRDPEMLDAMLKAIiHPR 
GIREKALHKHLNKHRDFLQEVCLRPSADPIFEPRQLPAFQEGIM 
SWSPKEKTYETDIiAVLQWVEELEQRVJMSDLQIRGWTCPSPDST 
RED LAYCEHLSDSQED I T WRGRGREGLA PQR KTTNPLDLAVMRL 
AALEQNVERRYLREPbWPTHBWLEKALLSTPNGAPEGTTTEIS 
YEITPRIRVWRQTLERCRSAAQVCLCLGQLERSIAWEKSVNKVT 
CLVCRKGDNDEFLLLCDGCDRGCHIYCHRPKMEAVPEGDWFCTV 
CLAQQVEGEFTQKPGFPKRGQKRKSGYSLNFSEGDGRRRRVLLR 
GRESPAAGPRYSEEGLSPSKRRRLSMRNHHSDLTFCEIILMEME 
SHDAAWP FIjE PVNPRLVSG YRR 1 1 KNPMDFSTMRERLXiRGGYTS 
SEEFAADALLVFDNCQTFNEDDSEVGKAGHIMRRFFE\SRWEEF 
YQGKQGQSVRQGRWGVTIiWHLPPTFQTKTCHFHLLMLPWVQTQV 

KiiX VAMPEPTKKEENEVPAPAPPPEEPSKEKEAGTTPAKDWTLV 
ETPPGEEQAKQNANSQLSILFIEKPQGGTVKVGEDITFIAKVKA 
EDLSEKPTINGSRKWMDIiASKAGKHLQLKETFERHSRVYTFEMQ 
1 1 KAKDNFAGNYR CE VTYKDKFDS CS FDLEVHES TG TTPNI D IR 
SAFKRSGEGQEDAGELDFSGIiLKRRJSVKQQEEEPOVDVWELLKN 
TKPSEYEKIAFQYESPTCSGMLKRLKRSIREEKKSAAPAKILDP 
VYQVDKGGRVR F WE bAJD P KLE VKWNKNGQELRP STKY I F E DTR 
CQS I LNI DNCQMTDDS E Y YVTAGDE KCS TELLVREPPIMVTKQI, 
EDTTD YCGER VELECE VS EDDAQVKWFKNGEE 1 1 LVQTR YR I R V 
EGKKH ILI IEGATKADAADYS VMTTGGQSS AKIjS VDIiKPI*KIIjT 
PLTDQTVNLGKEICLKCEISENIPGKWTKNGIiPVQESDRLKWH 
KGRIHKLVIDHALTEDEGDYVFAPDAYNVTLPAKVHVIDPPKII 
LDGLDADNTVTVIAGNKLRLEIPISGEPPPKAMWSRGDKAIMEG 
SGRIRTESYPDSSTLVIDIAERDDSGVYHINLKNEAGEAHASIK 
VKWDFPDPPVAPTVTEVGDDWCIMNWEPPAYDGGSPILGYFIE 
RKKKQSSRWMRLNFDLCKETTFEPKKMIEGVAYEVRIFAVNA\I 
GISKPSMPSRPFVPLAVTSPPTLLTVDSVTDTTVTMRWRPPDHI 
GAAGLDG YVLE YCFEGS TS AKQSDENGEAAYDliPAE DW I VANKD 
LIDKTKFTITGLPTDAKI FVRVKAVNAAGASEPKYYSQPI LVKE 
IIEPPKIHSPKHLKQTYIRRVGDRVILVIPFQGKPRPELTWKKD 
GAEIDKNQINIRNSETDTIIFIRKAERSHSGKYDLQVKVDKFVE 
TASID I R I IDRPGPPQIVKI EDVWGRNVALTWTPPKDDGNAAI T 
GYTIQKADKKSMEWLRVIEHIIEPVPHTELVIGNEYYFRVFSEN 
MCX5LSEDATMTKESAVIARDGKIYKNPVYEDFDFSEAPMFTQPL 
VNRIiCHSGYMATLNCSVRGNPKPKITWMKNKVAIVDDPRYRMFS 
NQGVCTLEIRKPSPYDGGTYCCKAVNDLGTVE IECKLEVKVIAn 
! PRTSRVKTEGSQGSSAMDFSVKVDIEKEVTCPICliELLTEPLSir" 
! DCGHSFCQACITAKIKESVI I SRGES S CPVOQTRFQPGNLR PNR 
HLAN1 VERVKEVKMS PQEGQKRDVCEHHGKKLQI FCKEDGKVI C 
I WVCELSQEHQGHQTFRINEWKECQEKLQVALQRLIKENQEAEK 
[ kEDDIRQERTAWKNYIQIBRQKILKGFNEMRVILDNEEQRELQK 
I LEEGE VNVLDNIiAAATDQ LVQQRQDAS TLI S DLQRRLRGS S VEM 
LQDV2DVMKRSESWTLKKPKSVSKKLKSVPRVPDLSGMLQVLKE 
LTDVQYYWVDVMLNPGSATSKVAISVDQRQVKTVRTCTFKNSNP 
CDFSAFGVFGCQ YFS SGKYYWE VDVSGKIAWILGVHSKISS INK 
RKSSGFAFDPSVWYSKVYSRYRPQYGYWVIGLQNTCEYNAFEDS 
SSSDPKVLTI,FMAV\LPWLG FS 

[ ^^VHVVEFGRGIgDFPYliFFQLTHCX?QRI CSVTQAGVQWCDHSS 








LQPQT PGLNQSSHLS IiLSSRDYRMLSS FNEWFWQDRFWLPPNVT 
WTBLEDRDGRVYPHPQDLriAAIiPLALVLl^RtiAFERFIGLPLS 
RWLGVRDOTRRQVKPNATLEKHFLTEGHRPKEPQLSIiLAAQCGIi 
TliQQTQRWFRRRRNQDRPQLTKKFCEAS WRFLFYLS S FVGGLSV 
LYHESWLWAPVMCWDRYPNQLTLSCPAADSEA\SLYWWYLLELG 
FYLSLLIRLPFDVKRKGGGPSSIKPRPHYDPPSTA\DFKEQVIH 
HFVAVILMTFSYSANLI^IGSLVLLLHPSSDYIXEACKMVNYMQ 
YQQVCDALFIiI FSFVFFYTRLVLFPTQILYTTYYES I SNRGPFF 
GYYFFNGLLMLLQLLHVFWSCLILRMLYSFMKKGQMEKDIRSDV 
EES DSSEE AAAAQ E PLQLKNGTAGG PRPAP TDGPRS RVAGRLTN 
RHTTAT 


6960 


387 


2068 


AKWAREKEMQEF\TRSFF\RGRPDLSTLTHSIVRRRY1AHSGRS 
HLEPEEKQALKRLVEEEPLKMQVDEAASRBDKIjDLTKKGKRPPT 
PCSDPERKRFRFNSESESGSEASSPDYFGPPAKNGVASRSHTHP 
KEENPRRA\SKAVEESSDEERQRDLPAQRGEESSEEEEKGYKGK 
TRKKPWKKQAPGKASVSRKQAREESEESEAEPVQRTAKKVEGN 
KGTKSLKESEQESEEEIIiAQKKEQREEEVEEEEKEEDEEKGDWK 
PRTRSNGRRKSAREERSCKQKSQAKRLLGDSDSEEEQKEAASSG 
DDSGRDREPPVQRKSEDRTQLKGGKRLSGSSEDEEDSGKGEPTA 
KGSRKMARLGSTSGEESDLEREVSDSEAGGGPQGERKNRSSKKS 
SRKGRTRSSSSSSDGSPEAKGGKAGSGRRGEDHPAVMRLKRYIR 
ACGAHRNYKKLLGSCCSHKERLSILRAELEALGMKGTPSLGKCR 
ALKEQREEAAEVASIiDVANI ISGSGRPRRRTAWNPLGEAAPPGB 
LYRRTLDSDEERPRPAPPDWSHMRGI I55DGESN 


6961 


340 


1646 


rpwssptmkpnfslrlrifnlncwgipylskhrad^KKEotfX 

KQES FDLALLE E VWS EQDFQ YLRQKLS PTY P AAHH FRS G X IGSG 
LCVFSKHPIQELTQHIYTLNGYPYMIHHGDWFSGKAVGLLVLHL 
SGMVLNAYVTHLHAEYNRQKDIYIAHRVAQAWELAQFIHHTSKK 
ADWIiLCGDIiNMHPEDLGCCLLKEWTGLHDAYLETRDFKGSEEG 
NTMVPKNCYVSQQELKP FPFGVR IDYVLYKAVSG FYI S CKSFET 
TTG FDPHRGT P LS DHEAliMATLFVRHS P PQQNP S STHG P \ AERS 
FL/MCVCLKEAIiDGSLGLGMA\ QAKWWA\TFA\ S YVIGLGL\ LL 
LALL C VLAAGGGAGEAA I LLWTP S VGL VL WAG AFYL FHVQE VNG 
LYRAQAELQHVLGRAREAQDLGPEPQLYAIiL\LGQQEGDRTKEQ 


6962 


340


1646 


RPWS S PTMKPNFSLRLR I FNLNCWGIPYLSKHRADRMRRLGDFL 
NQES FDLALLEEVWSEQDFQ YLRQKLS PTYPAAHHFRSG t IGSG 
LCVFS KHP IQELTQHIYTLNG YP YM 1 HHGDWFSGKAVGLLVLHL 
SGMVLNAYVTHLHAEYNRQKDIYLAHRVAQAWEliAaFIHHTSKK 
ADWLLCGDLNMHPEDIX3CCLLKEWTGLHDAYLETRDFKGSEEG 
NTMVPKNC YVSQQELKPFPFGVRID YVLYKAVSGFYI SCKS FET 
TTGFDPHRGTPLSDHEALMATLFVRHS PPQQNPS STHGP\AERS 
PL / MCVCLKEALDGSIjGLGMA\QAR W WA\TFA\S YVIGLGL \ LL 
LALLCVLAAGGGAGEAA2L LWTPS VGLVLWAGAF YIiFHVQE VNG 
LYRAQAELQHVLGRAREAQDLGPEPQLYALL\LGQQEGDRTKEQ 


6963 


374 


2618 


k. v i f Xj i ux±i1jBJ\ f K. l AoN UKASEENE I TQ PGGSS AK PGL PCLNF 
EAVLSPDPALIHSTHSLTNSHAHTGSSDCDISCKGMTER1HSIN 
LHNFSNS\^ETLNEQRNRGHFCDVTVRIHGSMLRAQRCVLAAGS 
PFFQDKLLLGYSDIEIPSWSVQSVQKLIDFMYSGVLRVSQSEA 
LQILTAAS ILQI KTVI DE CTRI VS QNVGDVF PG I QDSGQDTPRG 
TPESGTSGQSSDTESGYLQSHPQHSVPRIYSALYACSMQNGSGE 
RS F YSGA WSHHETALGLPRDHHMEDPS WITR IHERSQQMER YL 
STTPE TTHCRKQ PRP VRI QTL VGN IH I KQEMEDD YDYYGQQRVQ 
ILERNESEECTEDtDQAEGTES EPKGESFDSGVS S S IGTEPDS V 
EQQFGPGAARDSQAEPTQPEQAAEAPAEGGPQTNQLETGASSPE 
RSNEVEMDSTVITVSNSSDKSVLC2QPSVNTSIGQPLPSTQLYLR 
QTETLTSNLRMPLTLTSNTQVIGTAGNTYLPALFTTQPAGSGPK 








PFIiFSIiPQPLAGQQTQFVTVSQPGLSTFTAQLPAPQPLASSAGH ~ 
STASGQGEKKP YECTLCN KT FTAKQNYVKHMFVHTGEKPHQCS I 
CWRS FS LKD YL I K \ HMVTHTG VRA YQCS ICNKR FTQKS S LNVHM 
RLHRGEKSYECYICKKKFSHKTLLERHVALHSASNGTPPAGTPP 
GARAGPPG WACTEGTTYVCS VCPAKFDQ I EQFNDHMRMHVS DG 


6964 


1 


178


SGR P FFFFFSNTDVYF I KKVTNRWTAGSSY KMTRMKSIGK1 LLL " 
QI FIG\NCSMFVLVI 


6965 


757 


208 


NVFIEPRlQGFMKTSAHPGQKHPDFSMGLLFPLLAAIiEVCSCGS 
SGSLGY2vTLPQNH\GLLGRNTIiVIiLGQMRRISPFLCLKDRSDFRF 
PQBKVEVSQLQKA\QAMSFLYDVLQQVFNFSHKALL\CC34EHDL 
PGPTPHFTSSAAGTPGDLLGAGDGRRRSWGQWVIEGSTLALRRY 
FQESISTLE 


6966 


820 


1867 


IITAliGVRGMPGCPCPGCGMAGPRLLFIiTAIALBLLGRAGGSQp' '" 
ALRSRGTATACRLDNKESESWGAIjLSGERLDTW 1 CSLLGSIjMVG 
LSGVFPLLVIPLEMGTMLRSEAGAWRIiKQLLSFALGGIiLGNVFL 
HLLPEAWAYTCSASPGGEGQSLQQQQQLGLWVIAGILTFIiALEK 
/HVPGQQGGGDQPGPQQRPHCCCRRAQWRPLSGPAGCRARPRCR 
GP \D1 KVSGYLNLLANTIDNFTHGLAVAASFLVS KKIGLLTTMA 
ILUIE I PHEVGDFAILLRAGFDRWSAAKLQLSTALGGLIiGAGFA 
ICTQ3PKGVEETAAWVLPFTSGGFLYIALVNVLPDLLEEEDPW 


6967 


162 


633 


G FLPFKYWI LDLSAS SRMETDCNPMELSSMSGFEEGSELNGFEG 
TDMKDMRLE AEAWND VL FAVNNM FVS KSLRCADD VAY I NVETK 
ERNRYCLELTEAGLKWGYAFDQVDDHLQTPYHBTVYSLLDTL\ 
SPAYR£AFGKR\LLQRLEAI*KRDGQS 


6968 


1 


2265 


RGGGGGRGGPGARERERPGEPERTMEAAAGGRGCFQPHPGIiQKT 
LEO. FHLSSMSS IiGGPAAFS ARWAQEAYKKES AKEAGAAAVPAP V 
PAATEPPPVLHLPAIQPPPPVLPGPFFMPSDRSTERCETVX»EGE 
TI S CFWGGEKRIjCIjPQ I LNS VLRD FS LQQ INAVCDELH I YCSR 
CTAD QLE I LKVMGI L PFS APS CGLI TKTDAERLCNALXj YGGAYP 
PPCKKELAASLALGLELSERS VRVYHE \CFGKCKGL\ LVPELYS 
S PS AACI QCLD \ CRLMYP PHKF WHSHKALENRTCHWGF \ DS A\ 
NWRAYILLSQDYTGKEEQARLGR\CLDDVKEKFDYGNKYKRRVP 
RVS S E PPAS I R P KTDDTS S Q S PAPS E KDKPS S WL RTLAGS S NKS 
LGCVHPRQRLSAFRPWSPAVSASEKELSPHLPALIRDSFYSYKS 
FETAVAPNVAIiAPPAQQKWSSPPCAAAVSRAPEPLATCTQPRK 
RKLTVDTPGAPETLAPVAAPEEDKDSEAEVEVESREEFTS S LSS 
LSSPSFTSSSSAKDLGSPGARALPSAVPDAAAPADAPSGLEAEL 
EHLRQALEGGLDTKEAKEKFLHEVVKMRVKQEEKLSAALQAKRS 
LHQELEFLRVAKKEKLREATEAKRNLRKEIER1,RAENEKKMKEA 
NESRLRLKRELBQARQARVCDKGCEAGRLRAKYSAQIEDLQVKL 
QHAEADREQLRADLLREREAREHLE K\WK\ELQEQLWPRARPE 
AAGSEG\AAELBP 


6969 


1855 


118 


AGTMHGRLKVKTS E EQAEAKRLEREQKLKL YQ S ATQAVFQKRQ A 
G BLDES VLELTS Q I IjGAN P DFATLWNCRREVLQQLETQKS PEEL 
mmjjvwwiu?* ij&o^ijKVIYFi^¥(yrwHHRCWLLGRLPEPNWTREL 
ELCARFLEVDERNFHCWDYRRFVATQAAVPPAEEIAFTDSLITR 
NFSNYSSWHYRSCLLPQLHPQPDSGPQGRLPEDVLLKELELVQN 
AFFTDPNDQSAWFYHRWLLGRADPQDALRCLHVSRDEACLTVSF 
SRPLLVGSRMEILLLMVDDSPLIVEWRTPDGRNRPSHVWLCDLP 
AASLNDQLPQHTFRVIWTAGDVQKECVLLKGRQEGWCRDSTTDE 
QLFRCELSVEKSrVLQSEIiESCKEU2ELEPENKWCL\LTIIIiLM 

raldpllyeketlqyfqtlk\awdpkraty\lddlrskfllens 

VLKMEYAEVRVLHLAHKDLTVLCHLEQLIiVTHIiDLSHNRLRTL 
PPALAALRCLEDPPPRT\VLQASDNAIESLDGVTNLPRI J QELLL 
CNNRLQQPAVLQ PLAS CPRI* VLIiNIiQGNPLCQ AVG I LEQLAELL 
PSVSSVLT 


6970 


3 


1528 


S FPPIjLSS PSAVGEGK VA VAAP CPGRSECARAKMA Y I QhB PLNE 
G F L S R I SGtiLIiCRWTCRHCCQKC YE SSCCQS SEDEVE I LGP FP A 
QT P P W LKAS RS SDKDGDS VHTAS E VPLTPRTNS P DGRRSSS DTS 
KSTYSLTRRISSLESRRPSSPLIDIKPIEFGVLSAKKEPIQPSV 
LRRTYNPDDYFRKFEPHLYSrjDSNSDDVDSLTDEBILSKYQLGM 
LHFSTQYDLLHNHLTVRVI BARDLP PP ISHDGSRQDMAHSNPYV 
KICLLPDQKNSKQTGVKRKTQKPVFEERYTFEIPFIiEAQRRTLL 
LTVVDFDKFSRHCVIGKVSVPLCEVDLVKGGHWWKALIPSSQNE 
VELGELLLS LNYLP SAGRLNVDVI RAKQLLQTD VSQGSDPF VKI 
QLVHGLKLVKTKKTSFLRGTIDPFYNESFSFKVPQEELENASLV 
FTVFGHNMKSSNDFIGRI VIG \QYS SGP\SE PNHWRRMLNTHRT 
AVEQWHSLRSRAECDRVS PASLEVT 


6971


37 


3702


ACFYVPGSRSFKJjIPRHGI>VNMGRSGKLPSGVSAKIiKRWKKGHS 
SDSNPAICRHRQAARSRFFSRPSGRSDLTVDAVKIiHNELQSGSL 
RLGKSEAPETPMEEEAEIiVXjTEKSSGTFLSGLSDCTNVTFSKVQ 
RFWESNSAAHKE ICAVIiAAVTEVI RSQGGKETETEYFAALI RKA 

aqhgvcsvlkgsefmfekapahhpaaistakfciqeibksggsk 
eatttlhmltllkdllpcfpeglvkscs etllrvmtlshvlvta 
camqafhs i» fhar pgls tlsaelnaqii talyd yvpsendlqpi, 

IAWLKVMEKAHINLVRLQWDI^LGHLPRFFGTAVTCIjLSPHSQV 
LTAATQSLKE I LKECVAPHMADIGS VTSSASGPAQSVAKM FRAV 
EEGLTYKFHAAWSSVLQLLCVFFEACGRQAHPVMRKCLQSLCDL 
RLSPHFPHTAALDQAVGAAVTSMGPEWIjQAVPLEIDGSBETLD 
FPRSWLLPVIRDHVQETRLGFFTTYFLPLANTLKSKAMDLAQAG 

stveskiydtlqwqmwtllpgfctrptdvaispkgiartlgmai 
serpdlrvtvcqalrtlitkgcqaeadraevsrfaknflpilfn 
lygqpvaagdtpaprravletirtyltitdtqlvnsiilekasek 

VliDPASSDFTRLSVLDLVVAIiAPCADEAAISKLYSTIRPYLESK 
AHGVQKKAYRVLEEVCASPQGPGALFVQSHLEDLKKTIiLDSIjRS 
rSSPAKEPRLKCIiBHIVRKLSAEHKEFITATiIPEVILCTKEVSV 
GARKNAFALLVEMGHAFLRFGSNQEEALQCYLVLIYPGLVGAVT 
MVSCS ILALTHLLFEFKGIiMGTSTVEQLLENVCLIiLASRTRDW 
KSALGFIKVAWVMDVAHIAKHVQLVMEAIGKLSDDMRRHFRMK 
LRNLFT\KFIPK\FGILTWGKKAVGPKEYHRVLVNIRKAEARAK 
RHRAL S QAAVEE E EEEE EE E E PAQGKGDS I E E I IiADSEDBE DNE 
EEERSRGKEQRKLARQRSRAWLKEGGGDEPLNFLDPKVAQRVIiA 
TQPGPGRGRKKDHSFKVSADGRLIIREEADGNKMEEEEGAKGED 
EEMADPMEDVIIRNKKHQKliKHQKEAEEEELEIPPQYQAGGSGI 
HRPVAKKAMPGAE YKAKKAKGDVKKKGRPDPYAYI PLNRS KLNR 
RKKMKLQGQFKGI>VKAAQRGSQVGHKNRRKDRRP 


6972 


2179 


973 


PGGAI LLPIiWRRTRPREATVPRGAAQRGRARSAEGRI PSSQSPS ' " 
PAE AGGATRS PP PRP PRPAR P PGPS APP LLRSDAG PGATVS AAA 
AAATERARRGATMGAQLSTLGHMVLFPVWFLYSLLMKLFQRSTP 
A I TLBS PDI XYP1>RL I DRE I ISHDTRRFRFALPS PQHI LGIiP VG 
QH I YLS AR I DGNLWRP YTP I SS DDDKG FVDLVI KVYFKDTH P K 
FPAGGKMSQYLESMQIGDTIEFRGPSGLIiVYQGKGKFAIRPDKK 
SNPI IRTVKSVGM I AGGTG ITPMLQVIRAIMKDPDDHTVCHLLF 
ANQTEKD IIiLRPELEEIiRNKHSARFKLWYTZiDRAPEAWDYGQG \ 
F VNEEM I RDHL P P P E \EEPLVLMCG P P PM IQ YACL PNI* \DHVGH 
PTERCFVF 


6973 


1 


1964 


LQPRCAHRGLRAQKCGRPAPGVDAMVLC PVIGKLIiHKRWLASA" ' 

S PRRQE I LS NAGLR FE WPS KFKEKLDKASFATP YG YAMBTAKQ 

KALEVANRLYQKDLRAPDWIGADTIVTVGGLILEKPVDKQDAY 

RMLSRFE/SGREHSVFTGVAIVHCSSKDHQLDTRVSEFYEETKV 

KFSELSEELLWEYVHSGEPMDKAGGYGIQALGGMLVBSVHGDFL 

NWGFPLNHFCKQLVKLYYPPRPEDLRRSVKHDSIPAADTFBDL 








SDVEGGGSEPTQRDAGSRDEKAEAGEAGQATAEAECHRTRETLP' 

PFPTRLLELIEGFMLSKGIiLTACKLKVFDXiLKDEAPQKAADIAS 

KVDASACGMERLLD I CAAMGLLEKTEQGYSNTETANVYLASDGE 

YSLHGPIMHNNDLTWNLFTYLEFAIREGTNQHHRALGKKAEDLF 

QDAYYQSPETRLRFMRAMHGMTKLTACQVATAFNLSRFSSACDV 

GGCTGALARELAREYPRMQVTVFDLPDI I E LAAH FQ PPG PQAVQ 

IHFAAGDFFRDPLPSAELYVLCRILHDWPDDKVHKLLSRVAESC 

KPGAGLLLVETLLBEEKRVAQRALMQSLNMLVQTEGKERSLGEY 

QCLLELKGFHQVQWMLGGVLDAIL\PPKWPPEAQAACSL 


6974 


3082 


2172 


RSCAAFAat'ASRPPLELFAPPGSHRSPPGRGVATSAQCALSVRK 
LLAARPGLGTKYQATMVYKTLFALCILTAGWRVQSLPTSAPLSV 
SLPTNIVPPTTIWTSSPQNTDADTASPSNGTHNNSVLPVTASAP 
TSLLPKNIS I ESREEEITSPGSNWEGTNTDPS PSGFSSTSGGVH 
LTTTLEEHSLGTPEAGVAATLSQSAAEPPTLISPQAPASSPSSL 
STSPPEVFSASVTTNHSSTVTSTQPTGAPTAPESPTEESSSDHT 
PTSHATAEPVPQEKTPPTTVSGKVMCELIDMET\PPPFPG 


6975 


2 


500 


RPRPTVHCCKWALKI>ETAMETLIJNVFKAHSGKEGDKYKLSKKEI* ' 
KE LLQTELS GF LDVKE LML* ATEALKT FEE A * KSPI IQCSSSRS 
SLP PAPQPPP YIi* LS AVPFP IHLPLPLLPPQAQKDVDAVDKVMK 
ELDENGDGEVDFQEYWLVAALTVACNNFFWENS 


6976 


1216 


970 


GCQL*VAYGTTENSPVTFAHFPEDTVEQKAESVGRIMPHTEARi 
MNMEAGTIiAKIjNTPGELCIRGYCVMbGYWGEPQKTEEAVDQDKW 
YWTGD VATMNEQGFCK I VGRS KDMI I RGGEN I YPAELEDFFHTH 
P KVQ EVQ WG VKDDRMGEE I CAC I RLKDGEETTVEE I KAFCKG K 

ISHFKIPKYIVFVTNYPLTISGKIQKFKLREQMERHLNL+IKQO 
ACPGRIA 


6977 


1298 
3 4 


588 


SliFINTNLljSNQIRKTSFGMCSEPISDNTEDQKGKIiKTPDFA^R 
ANKKSKHHVNGNRTVEPFPEGTQMAVFGMGCFWGAERKFWVLKG 
VYSTQVGFAGGYTSNPTYKEVCSEKTGHAEWRWYQPEHMSFE 
ELLKVFWENHDPTQGMRQGNDHGTQYRSAIYPTSAKQMEAALSS 
KEN YQ KVLS EHG FG P I TTD IREGQTF YYAED YHQQYLS KNPNG Y 
CGLGGTGVSCPVGIKK 


6978 




242 


S FP FRDS RRCG CCKGS S JjRHTAVAM VKLS KEAKQRIjQQL FKGS Q 
FAIRWGF IPLVI YLGFKRGADPGMPEPTVLSLLWG 


6979 
6980


3917 
1 


1146 

420


DEARVRGEAVAAAILSRCRHWSGPPPFPPSPPDRKGLRGTEPWE 
AGPGSGATPGARAMDVRRLKVNELREEIjQRRGLDTRGLKTELAE 
RLQAALEAEEPDDERELDADDEPGRPGHINEEVETEGGSEIiEGT 
AQPPPPGLQPHAEPGGYSGPDGHYAMDNXTRQNQFYDTQVIKQE 
NESGYERRPLEMEQQQAYRPEMKTEMKQGAPTSFIiPPEASQLKP 
DRQQFQSRKRPYEENRGRGYFEHREDRRGRSPQPPAEEDEDDFD 
DTLVA IDT YNCDLHF KVARDRSSG YPLTIEGFA YL WSGARAS YG 
VRRGRVCFEMKINEEISVKHIjPSTEPDPHWRIGWSIaDSCSTQL 
GE E P FS YG YGGTGKKS TNS RFENYGDK FAENDVI GC FADFE CGN 
DVEliS FTKNGKWMGI AFR IQKEALGGQALYPHVLVKNCAVEFNF 
GQRAEPYCSVIjPGFTPIOHL.pt J3RDTDf"n;rDVQvxpriT?TT ... „ . 

glpaagkttwaikhaasnpskkynilgtnaimdkmrvmglrrqr 

NYAGRWDVLIC^T^CI^RIilQJj^KKRNYIIiDQTNVYGSAQR 

rkmrpfegfqrkaivicptdedlkdrtikrtdeegkdvpdhavl 

EMKANFTbPDVGDFLDEVLFIELQREEADKLVRQYNEEGRKAGP 
PPEKRFDNRGGGGFRGRGGGGGFQRYENRGPPGGNRGGFQNRGG 
GSGGGGNYRGGFNRSGGGGYSQNRWGNNNRDNNNSNNRGSYNRA 
PQQQPPPQQPPPPQPPPQQpppppsYSPARNPPGASTYNKNSNI 
PGSSANTSTPTVSSYSPPQSFGFFPSTFQPSYSQPPYNQGGYSQ 

gytapppppppppaynygsyggynpapytppppptaqtypqpsy 
nqyqqyaqqwnqyyqnqgqwppyygnydygsysgntqggtstq 

3TRGRKTGRVAAPSrKRRlXJNMQKI^TRSPAMSI*SOPGLGYHPT 








C WTLRW P PLCSLHALHVFHCLFSSRLGTP VS PR LAMDPNC S CEA 

GGSCACAGSCKCKKCKCTSCKKSCCSCCPLGCAKCAQGC1CKGA 
SEKCSCCA 


6981 


10 


1054 


PGRGFRRASLRPAFAARGVFQGGLGQAKQARTRACAALPTPHPS 
APRLLEPQGVFSLFPPPPGPWPNMILTKAQYDEIAQCLVSVPPT 
RQSLRKLKQRFPSQSQATLLS I FSQE YQ KH I KRTHAKHHTSEA I 
ESYYQRYLNGVVKNGAAPVLIjDIiANEVDYAPSLMARLILERFIjQ 
EHEETPPSKSI INSMLRDPSQI PDGVLANQVYQCIVNDCCYGPL 
VDCIKHAIGHEHEVLLRDLLLEKNLSFLDEDQLRAKGYDKTPDF 
ILQ V P VAVEGHI I HW I ES KAS FGD E CSHHAYLHDQ FW S YWNRFG 
PGLVIYWYGFIQELDCNRERGILLKACFPTNIVTLCHSIA 


6982 


153 


1285 


FPQQDCSAPAAPGLAGSEPRRLRAYRRRRQRARGLKRVAWLAPP 
PSLLQGLQGWAQAPVDGTLGPEDSRASSPMIQNSRPSLLQPQDV 
GDT VETLMLHP VI KAFLCGS I S GTCSTLLFQPLDLLKTRLQTLQ 
PSDHGSRRVGMLAVLLKWRTESLLGLWKGMSPSIVRCVPGVGI 
YFGTLYSLKQYFLRGHPPTALESVMLGVGSRSVAGVCMSPITVI 
KTRYESGKYGYESIYAALRSIYHSEGHRGLFSGLTATLLRDAPF 
SGI YLM FYNQT KNI VPHDQVDATL I P ITNFS CGI FAG I LAS LVT 

QPADVIKTHMQLYPLKFQWIGQAVTLIFKDYGLRGFFQGGIPRA 
LRRTLMAAMAWTVYEEMMAKMGLKS 


6983 


82 


773 


EMS FLQDPS FFTMGMWS IGAGALGAAALAI*LLANTDVFLS KPQK 
AALEYLEDIDLKTLEKEPRTFKAKELWKKNGAVIMAVRRPGCFL 
CREEAADLS S LKSMLDQIjGVPLYAWKEH IRTEVKDFQP YFKGE 
I F LDE K K K F YG PQRR KMM FMGF I RLGVWYNFFRAWNGG F SGNLE 
GEG FILGG VF WGSGKQG ILLEHRE KEFGDKVNLLS VL E AAKM I 
KPQTLASEKK 


6984 


1845 


1282 


GGRS AYSLPAGS LPRVPATAAAKMASGVQVADE VCRI FYDMKVR 

KCSTPEEIKKRKKAVIFCLSADKKCXIVEEGKEILVGDVGVTIT 

DPFKHFVGMIiPEKDCRYALYDASFETKESRKEELMFFLWAPELA 

PLKSKMIYASSKDAIKKKFQGIKHECQANGPBDLNRACIAEKLG 
GSLIVAFEGCPV 


6985 


1887


1324 


RRTAGI YPCFPKPGRTRHALCS WLLLLTGQLAFDD FQESCAMM 
WQKYAGSRRS M P LG AR I L FHG VF YAGG FAI VYYL I Q KFHSRAI» Y 
YKLAVEQLQSHPEAQEALGPPLNIHYLKLIDRENFVDIVDAKLK 
IPVSGS KS EGLL YVHSSRGGP FQRWHLDE VF LELKDGQQ I P VFK 
LSGENGDEVKKE 


6986 


642 


1350 


YHLYFKMGDPNSRKKQALNRLRAQLRKKKESLADQFDFKMYIAF 
VFKEKKKKSALFEVSEVIPVMTNNYEENILKGVRDSSYSLESSIj 
ELLQKDWQLHAPRYQSMRRDVIGCTQEMDFILWPRNDIBKIVC 
LLFSRWKESDEPFRPVQAKFEFHHGDYEKQFLHVLSRKDKTGIV 
VNN PN QS VFLF I DRQHLQTP KNKAT I FKLCS I CLYLP QEQLTHW 
AVGTI EDHLRP YMPE 


6987 


1623 


341 


LEAAE KAS RAFKESQRQTDS KN YETENWS PQKSQRR YDM YftTAC 
FLGE1EVGLYTIQILQLTPFFHKENELSKKHMVQFLSGKWTIPP 
DPRNECYIALSKFTSHLKNIiOSDLKRCFnvFTnVMvr t ia/dvt^ 
KE I AE I MLS KKVSRCFR KYTELFCHLDP CLLQS KE SQLLQEENC 
RKKLEALRADRFAGLLEYLirPNYKDATTMESIVNEYAFLLQQNS 
KKPMTNEKQNS ILANI ILSCLKPNSKL IQPLTTLKKQLREVLQF 
VGLSHQ Y PGP YFLACLLFWPENQELDQDS KL I EKYVS S LNRS FR 
GQYKRMCRSKQASTLFYLGKRKGLNSIVHKAKIEQYFDKAQNTN 
SLWHSGDVWKKNEVKDLLRRLTGQAEGKLISVEYGTEEKIKIPV 
ISVYSGPLRSGRNI ERVSFYLGFSIEGPPGL 


6988


3 


689 



TQLLRRPAVFVGSAASGIRSGLWSASSGHWCAPAAGRAHAPVPR 
LVRGLGAAS TAAPQ DAQTG PQ PMPRADC I MRHLP YFCRGQWRG 
PGRGS KQLG I PTANFPEQWDNLPAD I STG I YYGWAS VGSGDVH 
mvVSIGWNPYYKNTKKSMETHIMHTFKEDFYGEILNVAIVGYL 










RPEKKFDSLESLISAIQGDIEEAKKRUELPEHLKIKEDNFFQVS 
KSKIMNGH 




6989 
6990 


2 


1118 


LMPSDRPJjS PSTHASAGSHCHAPPTTARRAFPI PFGSKSNMATIj 
KDQIilYNLLKEEQTPQNKITWGVGAVGMACAISILMKDLADEL 
ALVDVIEDKLKGEMMDLQHGSIiFLRTPKIVSGKDYNVTANSPOiV 
IITAGARQQEGESRLNLVQRNVNIFKFIIPNWKYSPNCKLLIV 
SNPVD I LTYVAWKI SGFPKNR VIGSGCNLDSARFRYLMGERLGV 
HPLSCHG WVLGEHGDS S VP VWS GMNVAG VSL KTLHPDLG TDKDK 
EQWKEVHKQWESAYEVIKLKGYTSWAIGLSVADLAESIMKNiR 
R VHP VS TM I KGI* YG1 KDDVFhS VPC I LGQNG I S DI»VKVTLT£ EE 
| iSMJCJjKltiAOTLWGIQKELQF 






719 


258 


THASGMASVVLiAIjRTRTAVTSLIjSPTPATALAVRYASKKSGGSS 
KNLGG KS SGRRQG I KKMEGH YVHAGNI I ATQRH FRWHPGAHVGV 
GKNKCL YALEEGIVRYTKEVYVPHPRNTEAVDIiITRLP KGAVLY 
KTFVHWPAKPEGTFKLVAML 




6991 


169 


451 


RRSSDFHNPGFLSRPVSLRENIHHQVICSTKNKRRNPKkt'AYLL 

SSLLMTNLNPNESTENQPVDAYWAFTLDQEFLTYACVEGTGCLF 
CGRHVH 




6992 
6993 


944 


510 


RQAPGCSSlALRQVRQVYCGLVRAPQVQTRPLSSRFVERRGAIjY 

RS PMNQENPPPYPGPGPTAPYPPYPPQPMGPGPMGGPYPPPQGY 

PYQGYPQYGWC^PQEPPKTTVYVVEDQRRDEIjGPSTCLTACWT 
ALCCCObWDMLT 


6994 


1 


374 


QW C VTCPQHJS) AkCjGPA V PPGIQA ¥ GAAPFKB^Q VDFTEMS KCRG 

DRVWIKNWNVASLCPLWKGPQTVVLSPPTAVKVEGIPAWIHHSH 
VKPAARE T WEARPS PDNP FR VTI>KKTTS PAP VTPGS 


6995 


346 


1100 


QWPEIQOPVMAASSISSPWGKHVFKAILMVLVALILLHSAIAQSR 
RDFAPPGQQKREAP VDVLTQIGRS VRGTLDAWI GPETMHIj VSES 
SSQVLWAlSSAISVAFFAliSGlAAQLLNALGIAGDYIAQGLKLS 
PGOVQTFLLWGAGALWYWLLSLLLGLVLALLGRILWGLKLVIF 
LAGFVALMRS VPDPSTRALI*LLALL I LYALLS RLTGSRASGAQL 
BAKVRGIiERQ VE ELRWRQRRAAKGARS VE EE 




144 


1346 


G S VAVGLSG I MAAQKD LWDAI V IGAG I QGCFTAYHLAKHRKR I L 
LLEQFFL PHSRGSSHGQSRI IRKAYI*EDFYTRMMHECYQIWAQI» 
EHEAGTQLHRQTGLLLLGMKENQELKTIQANLSRQRVEHQCLSS 
EELKQRFPNIRLPRGEVGLLDNSGGVIYAYKALRALQDArRQLG 
GIVRDGEKWEINPGLLVTVKTTSRSYOAKSXiVITAGPWTNQIili 
RPLGIEMPLQTLRINVCYWREMVPGSYGVSQAFPCE5 , LWIiGLCPH 
HIYGLPTGEYPGLMKVSYHHGNHADPEERDCPTARTDIGDVQIL 
SSFVRDHLPDLKPEPAVIESCMYTNTPDEQFILDRHPKYDNIVI 
GAGFSGHGFKLAPWGKILYELSMKL1TPSYDZ1APFRISRFPSLG 




6996 


543 


1942

| 1 


ETANAEAAAKKSAMDWKBVIjRRRIjATPNTCPNKKICSEQELKDEE 
MDLFTKYYSEWKGGRKNTNEFYKTIPRFYYRIiPAENEVLIiQKIjR 
EESRAVFLQRKSRELLDNEELC2NLWFLLDKHQTPPMIGEEAMIN 
YENFLKVGEKAGAKCKQFFTAKVFAKLLHTDSYGRISIMQFFNY 

vmrkvwlhqtriglslydvagqgylresdlenyileliptlpql 

DGLEKS FYS FYVCTAVRKF FFFLDPLRTGK I K IQD IIACS FLDD 

llelrdbblskesqetnwfsapsalrvygqylnldkdhngmlsk 

EELSRYGTATMTNVFLDRVFQECLTYDGEMDYKTYIiDFVIiALEN 

^kepaalqyifklldienkgylnvfslnyffraiqelmkihgqd 
pvsfqdvkdeifdmvkpkdplkislqdlinsnqgdtvttilidl 
>jgfwtyenrealvandsensadlddt 




6997 


370 


1104


WELTIFILRLiAIYILTFPIiYLLNFLGIiWSWICKKWFPYFLVRF 
tVIYNEQMASKKRELFSNLQEFAGPSGKLSLLEVGCGTGANFKF 
^PPGCRVTCIDPl^NFBKFLIKSIAENRHLQFERFVVAAGENMH 








QVADGSVDVVVCTL>VIjCSVKNQERILREVCRV1iRPGGAFYFMEH 
VAAECSTWNYFWQQVLDPAWHIjLFDGCNLTRESWKALERASFSK 
LKLQH IQAPLS WE LVRPH I YG YAVK 


6998


2 


616 


FVSRAliLRVRS RRHPAEERAAPGRPEDAP IECPGATNCPEPLWC 
SHLP VP YAPPTME S RGKSAS S PKPDTKVPQVTTEAKVP PAADGK 
AP LT KPSKKEAPAE KQQP PAAPTTAPAKKTS AKADPALLNNHSN 
LKPAPTVPSS PDATPEPKGPGDGAE EDEAAS GG PGGRGPWS CEN 
FNPLLVAGGVAVAAI AIjI LGVAFLVRKK 


6999 


14 


1591 


GRAGACSRRDTAMS IE IESSDVIRLIMQYLKENSLHRALATLQE 
ETTVS LNTVDS IESFVAD I NSGH WDTVLQAIQSLKLPDKTLI DL 
YEQVVLELIELREX>GAARSI*IiRQTDPMIMI*KQTQPERYIHLENL 
LARS YFDPREAYPDGSSKEKRRAAIAQALAGEVS WPP SRLMAL 
LG QAL KWQQHQGLL PPGMT I DLFRG KAAVKD VE EEKFPTQLS RH 
IKFGQKSHVECARFSPDGQYLVTGSVDGFIEVWNFTTGKIRKDL 
KYQAQDNFMMM DD AVL CMC FSRDTEMLATGAQDG K I KVWKIQSG 
QCLRRFERAHS KGVTCLS FS KDSSQ I LSASFDQTI R IHGLKSGK 
TLKEFRGHSS FVNBATFTQDGHY 1 1 S ASSDGTVKI WNMKTTECS 
NTFKS LGS TAGTDI TVNSVI LLPKNPEHF WCNR S NZWTMNMQ 
GQ I VR S FS SG KRE GGDFVCCALS P RGEW I YCVGEDFVL YCFSTV 
TGKLERTLTVHEKDVIGIAHHPHQNLIATYSEDGLLKLWKP 


7000 


2 


827 


GPGWFLELMESEGPPESERSEFFSQREEENEEEEAQEPEETGP 
KNPLLQPALTGDVEGLQK1FEDPENPHHEQAMQLLLEEDIVGRN 
LIi YAACMAGQSDV I RALAKYG VNLNE KTTRGYTL LHCAAAWGRL 
ETLKALVELDVDI EALNFRE ERARDVAAR YSQTECVEFLDWADA 
RLTLKKYIAKVSLAVTDTEKGSGKLLKEDKNTILSACRAKNEWL 
ETHTEAS INELFEOKQQLEDIVTPI FTKMTTPCQVKSAKSVTSH 
DQKRSQDDTSN 


7001


2056 


844 


RRCLIIAFLKGCFIFIYFlFIFETEFIiSCCPGWSAVAQSRLIAN 
FASQVQAI FILPKDSQVGPDVKSEAAPKRAI*YES VFGSGE I CGP 
TS PKRLCIRPSEPVDAWWS VKHDPLPLLPEANGHRSTNS pti 
VSPAI VS PTGDSR PNMSRPLITRS PAS PLNNQGI PTPAQLTKSN 
AP VH I D VGGHMYT S S LATLTKY PES RI GRLFDGTEP I VLDS LKQ 
H YFI DRDGQMFR Y I LNF LRTS KLL I PDDFKD YTLL YEE AKYFQL 
QPMLLEMERWKQDRETGRFSRPCECIjWRVAPDLGERITLSGDK 
SLIEE VFP E IGDVI-ICNSVNAGWNHDSTHVIRFPLNGYCHIiNS VQ 

vlerlqqrgfeivgscgggvdssqfseyvlrrelrrtprvpsvi 
rikqepld 


7002 


1043 


498 


PMPSS TRWTTS *TYTDTSSAWACRFTTGTCT* TAAPGPTVR WWP 
TPCSRHQSRRRLTCWCSTSRPCGR*GGI.CVRTAPTRPTTSASSS 
SWTSAGTSWPAGRRTGTATSGTATTTSVWPGCGTRMWSTQWSSV 
PRSRS CCSRPATT PPSKPGAPHAPCAS S RHLAHGIiAPSS PGLPA 
RGAEVC 


7003 


818 


61 


QGRFRAFCWQRDFIiQPPGMRLSALIiALASKVTLPPHYRYGMSPP 
GSVADKRKNP P W IRRRPVWE P ISDEDW YIiFCGDTVEI LEGKDA 
GKQG KWQVI RQRNW VWGGLNTHYR YIGKTMDYRGTMI PSEAP 
LLHRQVKLVDPMDRKPTEIEWRFTEAGERVRVSTRSGRIIPKPE 
FPRADGIVPETWIDGPKDTSVEDALERTYVPCLKTLQEEVMEAM 
GIKETR\NTRRSIGIEPGAEQLL,PNFCPSLEG 


7004 


121 


2285 


FLIiPVLTSRSLRQPAVPHARLGGVEPAAMKSARAKTPRKPTVKK 
G \ PKRTLKTQLG/ Y YCRVR P LGFP DQECCI E V INNTTVQLHTP E 
G YRLNRNGD YKETQ YS FKQVFGTHTTQKELFDWANPL VNDL IH 
GKNGLL FTYG VTGSGKTHTMTGS PGEGGLLPRCLDM I FNS I GS F 
QAKRYVFKSNDRNSMDIQCBVDALIjERQKREAMPNPKTSSSKRQ 
VDP E FAD M I TVQBFC KAEE VDEDS VYG VF VS Y IE I YNN Y I YD L L 
EEVPFDPINPNLHNLNCFVKI KNHNM YVAGCTEVEVKSTEEAFE 
VFWRGQKKRRIANTHLNRESSRSHSVFNIKLVQAPLDADGDNVL 








QEKEQITISQLSLVDLAGSERTNRTRAEGNRI^REAGNINQSLMT " 
I*RTCMDVLRENQMYGTNKMVPYRDSKLTHLPKNYFDGEGKVRMI 
VC VNPKAE D YEENLQVMR FAEVTQEVE VARP VDKAI CGLTPGRR 
YRNQPRGP\IGNEPLVTDVVLQSFPPLPSCEILDINDEQTLPRL 
IEALEKRHNLRQMMIDEFNKQSNAFKALLQEFDNAVLSKENHMQ 
GKLNEKEKM ISGQKLE I ERLEKKNKTLEYKI EI LEKTTTI YEEP 
KRNLQQELETQNQKLQRQFSDKRRjbEARLQGMVTETTMKWEKEC 
ERRVAAKQLEMQNKLWVKDEKLKQLKAI VTE PKTEKPBRPSRER 
DREKVTQRSVSPSPVPVSYL 


7005 


63 


876 


RNMALYQRWRCLRLQGLQACRLHTAWSTPPRWLAERLGLFEEL 
WAAQVKRIASMAQKEPRTIKISLPGGQKIDAVAWNTTPYQLARQ 
I S S TLADTAVAAQVNGE P YDLER P LETDSDLR FLTFDS PEGKAV 
FWHSSTHVLGAAAEQFIiGAVLCRGPSTEYGFYHDFFIiGKERTIR 
GS ELP VLE R I CQEIiTAAARP FRR LEASRDQLRQLFKDNP FKLHL 

IEEKVTGPTATVYGCGTLVDLCQGPHLRHTGQIGGLKLLSNSSS 
LWRSSG 


7006 


22 


898 


nafgrhstavkmaaaawlqvlpvi llllgaHpspls ffsagpat~~ 

VAAADRSKWHIPIPSGKNTOSFGKILFRNTTIFLKFDGEPCDLS 

lnitwylksadcyneiynfkaeevelyleklkekrglsgkyqts 
sklfqncselfktqtfsgdfmhriipiilgekqeakengtnijtfig 

DKTAMHEPLQTWQDAPYIFlVHIGISSSKESSKENSiSNIiFTMT 
VE VKGP YE YLTLEDY P LMI F FMVM C I VYVL FGVLWIiAWS ACY WR 
D L LR I Q FW I GAV I FLGMLE KAVF YAGFQ 


7007 


2 


1001 


AMTVSGPGTPEPRPATPGASSVE(iLRJtE6NELFKCGDYGGALAA 
YTQALGLDATPQDQAVLHRNRAACHLKI»EDYDKAETEASKAIEK 
IXIGDVKALYRRSQALEKI^RLDQAVIJDLQRCVSLEPKNKVFQEA 
IiRN I GGQ I QEKVRYMS STDAKVEQMFQ I LLDPEEKGTEKKQKAS 
QNL WLAREDAGAEKI FRS NGVQL LQR LLDMGE TDLMLAALRTL 
VGICSEHQS RTVATLS I LGTRRWS ILGVES Q AVS IiAACHLLQ V 
MFDALKEG VKKGFRGKEGAI I VGEWKQVWGLLDVTVMEGMGLSQ 
PGQFFGDQTCSCRLFGIRFGDI ILL 


7008 


70 


1478 


CRS ALGHERP PPAHL PAGGR RLQTCPRS CRWLGRP PSGLP PGPR 
S PP PLAGPGQKMVQKKPAELQGFHRSFKGQNPFELAFSLDQPDH 
GDSDFGLQCSARPDMPASQP ID I PDAKKRGKKKKRGRATDS FSG 
RFEDVYQLQEDVLGEGAHARVQTCINLITSQEYAVKI I EKQPGH 
IRSRVFREVEMLYQCQGHRNVLELIEFFEEEDRFYLVFEKMRGG 
S I LSH IHKRRHFNELEAS VWQDVASALD FLHNKG I AHRDLKPE 
NILCEHPNQVS PVKICDFDLGSGI KLNGDCS PISTPELLTPCGS 
AE YMAPEWEAFSEEAS I YDKRCDLWSLG VI L YILLSG YPPFVG 
RCGSDCGWDRGEACPACQNMLFES IQEGKYE FPDKDWAHI SCAA 
KDLISKLLVRDAKQRLSAAQVLQHPWVQGCAPENTLPTPMVLQR 
WDSHFLLP PHPCRIHVRPGGLVRTVTVNE 


7009


1 


626 


ARQLRNS W VDDFVAAPLI PLSQQI PTGNS L YES YYKQ VD PAYTG 
RVGASEAALFLKKSGLS DI ILGKI WDLADPEGKGFLDKQGFYVA 

LRLVACAOS(3HF.VPT»^TJT/KIT CMDODlfffUrvPC enr iunri>nr>onr>nTr 

WAVR VEE KAKFDG I FESLLPINGLLSGDKVKPVLMNSKLPLDVL 
GRVWDLSD I DKDGHLVRDEFAVAMHLVYRALE 


7010 


79 


571 


SHTRRAWPETLLS PLCPLLGGGTAMSGGEQKPERY YVGVDVGT " 
GS VRAALVDQSGVLLAFADQP IKNWEPQFNHHEQSSEDI WAACC 
WTKKWQG IDLNQI RGLGFDATCSLWLDKQFHPIjPVNQEGDS 
HRNVIMWLDHRAVSQVNRINETKHSVLQYVGG 


7011 


3 


994 


riqtlpnqnqsqtqpllxtppavl0piap0ttf6vqtqpqpqsl 
lqaqisaasitpllqtqpqpllqqpqqkagllqppvrivsqpqp 
arrldppsrfsgrndrgdqvpnrkddrsrerererrrsrerspq 
rkrsrersprrerersprrvrrwprytvqfskfsldcpsc3dmm 
elrrryqnlyipsdffdaqftwvdafplsrpfqlgnycnfyvmh 
REVESLEKNMAILDPpDADEiLYSAKVMLMASPSMEDLYHKSCAL 
AEDPQELRDGFQHPARLVKFLVGMKGKDEAMAIGGHWSPSLDGP 
DPEKDPSVLIKT\AIRCCKALTG 


7012 


1 


2661


RRAGSVKRGSARLFGPTE^QSEkPLRPSAARRPBMLSGKKAAAA 
AAAAAAAATGTEAG PGTAGGSENGS BVAAQ PAGLSG PAE VGPGA 
VGERTPRKKEPPRASPPGGIAEPPGSAGPQAGPrWPGSATPME 
TGIAETPEG\RRTSRRKRAKVEYREMDESIiANLSEDEYYSEEER 
NAKAEKEKKLPPPPPQAPPEEENESEPEEPSGVEGAAFQSRLPH 
DRMTSQEAACFPDIISGPQQTQKVFLFIRNRTLQLWLDNPKIQL 
TFEATLQQLEAPYNSDTVLVHRVHSYLERHGLINFGIYKRIKPL 
PTKKTGKVI 1 I GSGVSGLAAARQLQSFGMD VTLLEARDR VGGRV 
AT FRKGNY VADLGAMWTGLGGNPMAWS KQVNMELAK I KQKCP 
LYEANGQAVPKEKDEMVEQEFNRLLEATSYLSHQLDFNVLNNKP 
VSLGQALEWIQLQEKHVKDEQIBHWKKIVKTQEELKELLNKMV 
NLKEKI KELHQQYKEAS EVKP PRD I TAE FLVXS KHRDLTALCKE 

YDELAETQGKIiEEKLQELEANPPSDVYLSSRDRQILDWHFANLE 
FANATPLSTIiSbKHWDQDDDFEFTGSHLTVRNGYSCVPVALAEG 
LDIKLNTAVRQVRYTASGCEVIAVNTRSTSQTFIYKCDAVLCTL 
PLGVLKQQPPAVQFVPPLPEWKTSAVQRMGFGNLNKWLCFDRV 
FWDPS VNLFGKVGSTTASRGELFLFWNLYKAP I LLALVAGEAAG 
IMENISDDVIVGRCIAILKGIFGSSAVPQPKETWSRWRADPWA 
RG S YS YVAAGSSGND YDLMAQP ITPG PS I PGAPQ PI PRL FFAGE 
HT I RN YPATVHGAL LS GLREAGR I ADQFLGAM YTLPRQATPG V P 
AQQSPSM 


7013 
7014 


1 


2661 


RRAGSVKRGKARLFGPTERQSERPLRPSAARRPEMLSGKKAAAA 
AAAAAAAATGTEAGPGTAGGSENGS EVAAQPAGLSGPAEVG PGA 
VGERTPRKKE PPRAS P PGGLAE PPGS AGPQAGPTVVPGSATpME 
TG 1 AE TPEG \ RRTS RR KRAKVE YREMDESLANLS EDE YYS EEER 
NAKAEKEKKLPPPPPQAPPEEENESEPEEPSGVEGAAFQSRLPH 
DRMTSQEAACFPDIISGPQQTQKVFIiFIRNRTLQIiWLDMPKIQL 
TFEATLQQLEAP YNSDTVLVHRVHS Y LERHGL I NFG I YKR I K PL 
PTKKTGKVI 1 1 GSGVSGLAAARQLQSFGMD VTLLEARDR VGGRV 
AT FRKGNY VADLGAMVVTGLGGNPMAVVS KQ VNMELAKI KQKCP 
LYEANGQAVPKEKDEMVEQEFNRLLEATSYLSHQLDFNVLNNKP 
VSLGQALEWI QLQEKHVKDEQIBHWKKI VKTQEELKELLNKMV 
NLKEKI KBLHQQ YKEAS E VKPPRD I TAE FLVKS KHRDLTALCKE 
YDELAETQGKLEEKLQELEANPPSDVYLSSRDRQILDWHFANLE 
FANATPLSTLS LKHWDQDDDFEFTGSHLT VRNG YS CVP VALAEG 
LDI KLNTAVRQVRYTAS GCE V I AVNTRS TSQTF I YKCDAVLCTL 
PLGVLKQQPPAVQFVPPLPEWKTSAVQRMGFGNLNKWLCFDRV 
FWDPS VNLFGHVGS TTASRG ELFLFWNLYKAP I LLALVAGEAAG 
IMENI SDDVI VGRCLAILKGI FGSSAVPQPKETWSRWRADPWA 
RGSYS YVAAGSSGND YDLMAQPITPGPS IPGAPQPI PRLFFAGE 
HT I RNYPATVHG ALLSGLR EAGRIADQFLGAM YTLP RQAT PG V P 
AQQSPSM 




3 


3950 


DFEVGDKIRiLATLEDGWLEGSLKGRTGIFPYRFVKIi'CPDTRVE 
ETMALPQEGSLARIPETSLDCLEtJTLGVEEQRHETSDHEAEEPD 
CI I SEAPTS PLGHLTSE YDTDRNS YQDEDTAGGPPRS PGVBWEM 
PLATDSPTSDPTEWNGISSQPQVPFHPNLQKSQYYSTVGGSHP 
HSEQYPDLLPLEARTRDYASLPPKRMYSQLKTLQKPVLPLYRGS 
S VSASR WKPRQSS PQLHNLAS YTKKHHTSSVYS ISERLEMKPG 
PQAQGLVMEAATHSQGDGSTDLDSKLTQQLIEFEKSLAGPGTEP 
DKILRHFS IMDFNSEKDI VRGSS KLITEQELPERRKALRPPPPR 
PCTPVSTS PHLLVDQNLKPAPPLWRPSRPAPLP PS AQQRTNAV 
3PKLLSRHRPTCETLEKEGPGHMGRSLDQTSPCPLVLVRIEEME 
U3LDMYSRAQEELNLMLEEKQDESSRAETLEDLKFCESNIESLN' 
MELQQLREMTLLSSQSSSLVAPSGSVSAENPEQRMLEKRAKVIE 
ELLQTERD Y I RDLEM C I ERIM VP MQQAQVPNI DFEGLFGNMQMV 
I KVS KQLLAALE I SDAVG P VFLGHR DELEGTYKI YCQNHDEAI A 
LLE I YEKDEKI Q KHLQDS LADLKS LYNEWGCTNY IKLGS FLIKP 
VQRVMRYPLLLMELLNSTPESHPDKVPLTNAVLAVKE1NVNINE 
YKRRKDLVLKYRKGDEDS LMEKI S KLNIHS 1 1KKSNRVSSHLKH 
LTGFAPQIKDE VFEETE KNFRMQERIiI KS F I RDLSL YLQHIRES 
ACVKWAAVSMWDVCMERGHRDLEQFERVHRYISDQLFTNFKER 
TERLV IS PLNQLLSMFTGPHKLVQKRFDKLLDF YNCTE RAEKLK 
DKKTLEELQSARNNYBALNAQLLDELPKFHQYAQGLFTNCVHGY 
AEAHCDFVHQAIiEQLKPLLSLLKVAGREGNLIAIFHEEHSRVLQ 
QLQVFTFFPESLPATKKPFERKTIDRQSARKPLLGLPSYMLQSB 
ELRAS LLARYP PEKL FQAERNFNAAQDUDVS LLEGDLVG VI KKK 
DPMGS QNRWL I DNGVTKGFVYSS FLKPYNPRRSHSDAS VGSHSS 
TESEHGSSSPRFPRQNSGSTI/TFNPN\S\MAVSFTSGSCQKQPQ 
DASPPPKEWDQGTLSASLNPSNSESSPSRCPSDPDSTSQPRSGD 

sadvardvkqptatprsyrnfrhpeivgysvpgrngqsqdlvkg 
cartaqapedrste pdgseaegnqvyfavytfkarnpnelsvsa 
nqklkiiiefknvtgntewwlaevngkkgyvpsnyirkteyt 


7015 


1842


513 


RQAWHE WAAPSWRGARIiVQSVLRVWQVGPHVARERVI P FSSLL 
GFQRRCVSCVAGSAFSGPRLASASRSNGQGSALDHFLGFSQPDS 
SVTPCVPAVSMNRDEQDVIJIiVHHPDMPENSRVIiRVVLIjGAPNAG 
KSTLSNQIiLGRKVFPVSRKVHTTRCQAliGVITEKETQVILLDTP 
GIIS PGKQKRHHIiELSLIjEDPWKSMES adl vwlvdvs dkwtrn 
QLSPQLLRCLTKYSQIPSVIjVMNKVDCLKQKSVLLEIiTAALTEG 
WNGKKLKMRQAFHSHPGTHCPSPAVKDPNTQSVGNPQRIGWPH 

fkeifmlsalsqedvktlkqylltqaqpgpweyhsavltsqtpe 
eicani irbkliehlpqevp ynvqqktavweeg pggelviqqkl 
lvpkesyvklligpkghvisqiaqeaghdlmdiflcdvdirlsv 

KLLK 


7016 


167 


2513 


HiNAPKPPPPRDSVEAVAAKRDTGGGSWGTGMDVSGQETDWRST 
AFRQKLVSQ IEDAMRKAG VAHS KSS KDMES HVFLKAKTRDE Y LS 
LVARLI IHFRDI HNKKSQAS VSDPMNALQSLTGGPAAGAAG I GM 
PPRGPGQSLGGMGSLGAMGQPMSLSGQPPPGTSGMAPHSMAWS 
TATPQTQLQLQQVAAAAAAATARSSSSSSRRRYSSSSSSSNSKQ 
FQAQQSAMQQ\QFQA\WQQQQQ^\QQQQQQQQHhIKLHHQKQQ 
QIQQQQQQLQRIAQLQIiQQQQQQQQQQQQQQQQALQAQPPIQQP 
PMQQPQPPPSQALPQQLQQMHHTQHHQPPPQPQQPPVAQNQPSQ 
LPPQSQTQPLVS O^QAIiPGQMLYTQPPLKFVRAPMVVQQPPVQP 
QVQQQQTAVQTAQAAQMVAPG VQVSQS SLPMLSSPS PGQQVQT P 
QSMPPPPQPSPQPGQPSSQPNSNVSSGPAPSPSSFLPSPSPQPF 
\ QS PVTARTPQNFS VPS PGPLNTP VNP S S VMS PAGS SQAEEQ Q Y 
LDKLKQLSKYlEPLRRMINKIDKNEDRKKDLSKMKSliLDILTDP 
S KRCP LKTLQKCE I ALE KliKNDMAVPTPP P ? PVPPTKQQYLCQP 
LLDAVLANIRSPVFNHSDYRTFVPAMTAIHGPPITAPWCTRKR 
RLEDDERQS I PS VLQGEVARLDPKFLVNLDPSHCSNNGTVHLI C 
KLDDKDLPSVPPLELSVPADYPAQSPLWIDRQWQYDANPFLQSV 
HRCmSRhhQhPDKHS VTALLNT WAQS VHQACLS AA 


7017 


1 


1785 


INLGNTCYMNS V I * ALFMATDFRRQ VLSLNLNGCNSLMKKLQHL 
FAFLAHTQREAYAPRIFFEASRPPWFTPRSQQDCSEYLRFLLDR 
LHEEEKILKVQASHKPSEILECSETSLQEVASKAAVLTETPRTS 
DG EKTLI E KMFGG KLRTHT RCLNCRSTSQKAEAFTDLS LAFWPS 
YS LEYMS CPDCS QS PS I QDGGLMQAS VPG PS EE PWYNPTTAAF 
I CDSLVNEKTI GSPPNEF YCSENTS VPNESNKILVNKDVPQKPG 
GETTPSVTDLLNYFLAPEILTGDNQYYCENCASLQNABKTMQIT 
EEPEYLILTLLRFSYDQKYHVRRKILDNVSLPLVLELPVKRITS 
FSSLSESWSVDVDFTDLSENLAKKLKPSGTDEASCTKLVPYLLS 
SVWHSGISSESGHYYSYARNITSTDSSYQMYHQSEALALASSQ 
SHLLGRDS PSAVFEQDLENKEMS KEWFLFNDSRVTFTS FQSVQK 
ITSRFPKDTAYVLLYKKQHSTNGLSGNNPTSGLWINGDPPLQKE 
LMDAITKDNKLYLQEQELNARARAtiQAASAS CS FRPNGFDDNDP 
PGS CGPTGGGGGGGFNTVGRLVF 


7018 


484 


1066 


S LVFRGNTWSGE AGHHCSAL FNLAAYHQL F VGTER IRAPE I 1 FQ ~~ 
P S I» X GEEQAG I AETLQ YI LDR Y P KDVQEML VQNVFLTGGNTM Y P 
GMKARMEKELLEMRPFRSSFQVOLASNPVLDAWYGARDWALNHL 

DDNEVWITRKEYEEKGGEYX.KEHCASNIYVPIRLPKQASRSSDA 
QASSKGSAAGGGGAGEQA 


7019 


1048 


335 


APGGFLVTMVFPAPSPPWMLGCCSHEVTAGPPTLCKDMSALVAA 
RMRHIPLAPGSDWRDLPNIEVRLSDGTMARKLRYTHHDRKNGRS 
SSGAIiRGVCSCVEAGKACDPAARQFNTLIPWCLPHTGNRHNHWA 
GLYGRLEWDGFFSTTVTNPEPMGKQGRVLHPEQHRWSVRECAR 

SQGFPDTYRLFGNILDKHRQVGNAVPPPLAKAIGLEIKLCMLAK 
ARE SASAKI KEEEAAKD 


7020 


1 


2154 


FADSKRKSVLLDKIKNLQVALTSKQQSLETAMSFVARNTFKRV^H 

NGFLMRICVAVFFSNTPTRASPQLREAVLKLSDAGITPIiFLTRQE 

DRQLINALQ INNTAVGHALVLPAGRDLTDFLENVLTCH VCLDI C 

N1DPSCGFGSWRPSFRDRRAAGSDVDIDMAFILDSAETTTLFQF 

NEMKKYIAYLVRQLDMSPDPKASQHFARVAWQHAPSESVDNAS 

MPPVKVEFSLTDYGSKEKIiVDFLSRGMTQLQGTRALGSAIEYTl 

ENVFESAPNPRDLKIWLMIjTGEVPEQQLEEAQRVILQAKCKGY 

FFWIX?IGRKVNIKEVYTFASBPIiDVFFKLVDKSTELNEEPLMR 

fgrllpsfvssenafylspdirkqcdwfqgdqptknlvkfghkq 
vnvpnnvtss ptsnpvtttkp vtttkpvttttkp vttttkp vt i 
inqpsvkpaaakpapakpvaakpvatktatvrppvavkpataak 
pvaakpaavrppaaaaakpvatkpevprpqaakpaatkpattkp 

MVKMSREVQVFEITENSAKLiHWERPEPPGPYFYDLTVTSAHDQS 
IjVL KQNLT VTD RVIGGLLAGQTYHVAWC YIjRSQ vrat yhgs fs 

tkksqppppqparsassstinlmvsteplaltetdicklpkdeg 
tcrdfilkwyydpntkscarfwyggcggnenkfgsqkecekvca 

PVIiAKPGVI S VMGT 


7021 
7022 


2 


338 


VNAVSFFPNGYAFATGSDDATCRIjFDLRADQELLLYSHDNIICG 
I TS VAFS KS G RIiLLAGYDDFNCNVWDTLKGDRAGVLAGHDNR VS 
CLGVTDDGMAVATGSWDS FLRI WN 




2 


856 


vyigsfwshpllipdnrklfeaeeqdlfrdiqslprnaaLrkln 
dlikrariakvhayi isslkkemps vfgkdnkkkelvnnlaeiy 
grierehqispgdfpnlkrmqdqlqaqdfskfqplkskllewd 
dmlahdiaqlmvlvrqeesqrpiqmvkggafegtlhgpfghgyg 

EGAGEGIDDAEWVVARDKPMrDEIFYTIiSPVDGKITGANAFGCEM 

VRSKLPNSVIiGKIWKIjADIDKDGMLDDDEFAIiANHLIKVKLEGH 
ELPNELPAHLLP PSKRKVAE 


7023 


2 


748 


AMVFGG WPY VPQ YRD IRRTONADGFSTWCLVr. r.un mtt orrt? — 

WFGRRFESPLLWQSAIMILTMLLMLKLCTEVRVANELNARRRSF 
TAADS KDEEVKVAPRRS FLDFDPHHFWQ WSS FSD YVQCVLAFTG 
VAGYI T YIjS I DSALFVETLGFLAVL TEAMLG VPQL YRNHRHQS T 
EGMS I KMVLM WTSGDAFKTAY FLLKGAPLQFS VCGLLQVLVDIA 
ILGQAYAFARHPQKPAPHAVHPTGTKAL 


7024 


1207 


190 



RTGVTGWAQVWMFGGGGVLSSGEQLQMPVKPERGLGPSDGWLV 
SSRRGSPGTVLGLPFWLLTPVLVSRSIRSMLLLTRSPTAWHRLS 
QLKPPVLPGTLGGQALHLRSWLLSRQGPAETGGQGQPQGPGLRT 
RliLITGLFGAGLGGAWLALRAEKERLQQQKRTEALRQAAVGQGD 
FHLLDHRGRARCKADFRGQWVLMYFGFTHCPDICPDELEKLVQV 
/RQLEAEPGLPPVQPVFITVDPERDDVEAMARYVQDFHPRLLGIi 
TGSTKQVAQASHSYRVYYNAGPKDEDQDY I VDHS IAIYLLNPDG 
LFTDYYGRSRSAEQI SDSVRRHMAAFRSVLS | 


7025 




832 


ernspignnenl*k\hsldclcfrgdwegotqpqtlqdn'qbegtH 

KQVIRTCEKRPTFNQHTVFNLHQRLNTGDKLNEFKELGKAP1SG 

sdhtqhqlihtsekfcgdkecgntflpdseviqyqtvhtvkkty 

ECKECGKSFSLRSSLTGHKRIHTGEKPFKCKDCGKAFRFHSQLS 
VHKRIHTGEKSYECKECGKAFSCG 


7026 


326 


1146 


npnpsigdxkdikkaaksmldpahkshfhpvtpslvflcfifdgH 

LHQAliLSVGVSKRSNTWGNENEERGTPYASRFKDMPNFIALEK 
SSVIjRHCCDLLIGVAAGSSDKICTSSLQVQRRFKAMMASIGRLS 

hgesadlliscnaesaigwissrpwvgelmftflfgdfesplhk 
lrkss*lprkhr*qpinavrmfi,dqcmdgsialraivseipvfe 
ekknng* kg igei f * wgctlpphywgavttnvpklsnsgkllg 
qdeqphifg 


7027 


43 


954 


grrlqqqqrpedaedgaegggkrgeagweggypeivkenklfeh H 

YYQEIiKIVPEGEWGQFMDALREPLPATLR I TG YKSHAKE I LHCL 
KNKYFKELEDbEMDGQKVEVPQPLSWYPBELAWHTNLSRKH^K 
SPHLEKFHQFLVSETESGNISRQEAVSMIPPLLLNVRPHHKILD 
MCAAPGS KTTQLI EMLHADMNVP F PEG FVI ANDVDNKF C YLLVH 
QAKRLS S PCI M WNHDAS S I PRLQ I D VDGRKE 1 LFYDR I LCD VP 
CSGDGTMRKNlDVWKKWTTLNSLQLHGIiQLRIATRGAEQL j 


7028 


189 


608 


SRP P PEPEPGTMVEKGSDSS SEKGGVPGT PSTQSLGSKN Fl RNS 
KKMQSWYSMLSPTYKQRNEDFRICLFSKLPEAERLIVDYSCALQR 
E I LLQGRL YLS ENW I c FYS N I FRWETT I S I QLKEVT CLKKEJCTA 
KLIPNAIQ 


7029 


1343 


40 


VIiE SNTE AKQATGTS S KLRHGTGQE KGREG PR C PSGIAQLRLWG~| 
/PCPHAGRETGPRASAPI PGS *GHGWHW*RKDGRGERS EGPSAL 

sphspsllnmqqapthvgpgmgsqrprsswpeqvgvgsqlsre 

RWRA* RSLPGAAASERTEMTKERSP/RPCQGYDSSNWFTQPGKK 
TRKRNSRRNTMVSRGGGCLLYPLQSIMPE*QLR*GAHASPPTQG 
R* G KGGPRS PLTKAS GTTH I PTP FFGS I P/RPTRDSGPGTDNS \ 
AAPGQKRGHREA * QGPE PV/ WGRVTTHLQGPAG * TKPLGS \ RNW 
VPGPAJEGEOGEGAGLEGRP*PLKGCRSTLTFSPQLSIPMVGKKP 
PEGTTASFFP\RSCHSE*RKPPPSCPHAPALSLPHPLPLPLPPL 

PLPLPGAGT*HSARSGRPGQSETGSLCHNCHHCPPHCPKCSPGG 
T 1 


7030
7031


2 


521 


FVCFSAPGSGQGGKRRVI^ELSAVGERVFAAE^U^KRRIRKGW| 
EYLVKWKGWSQKYSTWEPEENILDARLLAAFEEREREMELYGPK 
KRGPKPKTFLLKAQAKAKAKTYEFRSDSARGIRIPYPGRSPQI>L 
ASTSRAREGtiRN \RVCPRQRAAPAPAAP \ PRKGPSGPGPRPG* G 
PGLH F PG PGGPS KHGFVPAS EQHQHQQHLPRRGPS GPGPRPG 




960 


59 


hcsvpgaewprkppaqicpqltsrphlssprslspgcghspgpgH 

/ CKPS /RHCDELHEGPSRTAALPCGKPQPKHGVEECG / PCPCLA 
PRRLTEPPAbTVSPVGRAAPSGAL*PSGRACSACSHRIAPEAAL 
S AAAPR PS LGSGQNASGLPAAS LP PQDS SQPHKTVP S PARS VP P 
LGAQARAAP PRLWC PRALVSG * E AS PEAVS VAAGP P VPGPT PS T 
SGSTASHSRRGC* S PR*TPAP PRRDHGRS AAFEVLTAAASAQP C 
ASQGGPRPTGAGRTPSPLGLPFSRGPPAASARPFCRHPSL I 


7032 
7033


1393 
<*89 


2104 
815 


RRPGRTEPVEPPPVPPPPRASNSKSRCR*RNLHLAPL»QSPLRK 
SRQIGTSSLPFGRSAGERPRPAATFCLSRGGSSPVFL*PSSSSL 
EPWMKRQFGRLHSLFWKS WQKMNS FLLTPKLDTSLMSGWR YRQR 
LPRLHTFLKKSLQMASELAPPLPTPAPLASSLPPPPGPPPLLPV 
PLA*LSRSGILVPPNSGFSLSC\PLGDH*GSSGEVRGSCGSPPP 
HHCWVLPPPP*LLLPPR 1 








RSRDCLSSSATSNRARRSKCSGPKRATPLDSQPGP*APPGPSSA " 
LMM PSSCPWRTGALGPS PAGSRALGRCTS SVG PGSRWLTRTSSP 
GCATRTWRTMRMEPRPLRSRMGESAPGIPAELPSAAPSGPSAPS 
AAAPSAPTTPAAAGPNTL*SRRTAEWCWPPSCSCCWGWC*SWSA 
WDWRRPPLQVS PAPSSSCRASCCWCLES IT* S SSTARSRATGAS 
SSSTCPTSRSDRGAAWTP\SPMGAPLLPCSVPLISREEALQDPR 
NPS P * GVCSGSSGHAGIiAljGKPPVACS VP 


7034 


92 


1942 


EDTSSMPFRLLIPLGLLCALLPQHHGAPGPDGSAPDPAHYRERV 
KAMF YHAYDS YLENAFP FDELRPLTCDGHDTWGS FSLTL I DALD 
TLL\TLFYFQI LGNVSE FQRWEVLQDSVDFDI dvnasvfetni 
R WGGLLS AHLLSKKAG VE VEAG WP CSG P LLRMAEEAAR KLL PA 
FQT PTGMP YGTVNLLHGVNPGET P VTCTAG IG T F I VE PATLS S L 
TGD P VFED VAR VALMRL WESRSD I GLVGNHI DVLTGKW VAQDAG 
IGAGVDSYFEYLVKOAILLQDKKLMAMFLEYNKAIRitfYTRFDDW 
YIiWVQM YKGTVSMPVFQS LEAYVIPGLQSL1GD IDNAMRTFLN Y Y 
TVWKQFGGLPEFYNI EQGYTVEKREGYPLRPBLI ESAMYLYRAT 
GDPTLLELGRDAVES I E KI SKVECGFATI KDLRDHKLDNRMES F 
FLAETVKYLYLLFDPTNFIHNNGSTFDAVITPYGECILGAGGYI 
FNTEAHP IDPAALHCCQRLKEEQWEVEDLMRE F YSLKRSRSKFQ 
KNTVSSGPWEPPARPGTLFSPENHDQARERKPAKQKVPLLSCPS 
QPFTS KLALLGQVFLDSS * PIiDNFFI F I FLRLN YNKLLLAI I KK 
K 


7035 


92 


1942 


EDTSSMPFRLLIPLGLLCALLPQHHGAPGPDGSAPDPAHYRERV 
KAMFYHAYDSYLENAFPFDELRPLTCDGHDTWGSFSLTLIDALD 
TLL\ TLFYFQ I LGNVSE FORWEVLODS VDFDTnvMn<2\TPPTKr t 

RWGGLLSAHLLSKKAGVEVEAGWPCSGPLLRMAEEAARKLLPA 
FQTPTGMP YGTVNLLHGVNPGETP VTCTAG IGTF I VE FATLS SL 
TGD PVFED VAR VAXiMRLWE S RSDIGL VGNH ID VLTGKWVAQDAG 
I GAGVDS YFE YLVKGA I LLQDXKLMAM FLE YNKAIRNYTRFDDW' 
YLWVQMYKGTVSMPVFQSLEAYWFGLQSLIGDIDNAMRTFLNYY 
TVWKQFGGLPEFYNI PQGYTVEKREGYPLRPEL I ESAMYLYRAT 
GDPTLLELGRDAVES IE KI S KVECGFATI KDLRDHKLDNRMES F 
FLAETVKYLYLL FDPTNFIHNNGS TFDAVI TP YGECI LGAGG YI 
FNTEAHP I DPAALHCCQRLKEEQWEVEDLMREF YSLKRSRSKFQ 
KNTVSSGPWEPPARPGTLFSPENHDQARERKPAKQKVPLLSCPS 
Q PFTS KLALLGQ VFLDS S * PLDNFFI FI FLRLNYNKLLLAI I KK 
K 


7036 


442 


751 


CLAPLFS CFQI INLHLAPSGRLRWAWLRGPGRN* LPGEGPS I PT 
RNW*ERKAGCSQPC/PAQQHHGRPPGVSPLPRDPHPTTLRPLPP 
PPPPPPPPPRRPPRNRRPG 


7037 


442 


761 


CLAPLFS CFQI INLHLAPSGRLRWAWLRGPGRN*LPGEGPS I PT 
RNW*ERKAGCSQPC/ PAQQHHGRPPGVS PLPRDPHPTTLRPLPP 
PPPPPPPPPRRPPRNRRPG 


7038 


155 


891 


GAGAASDMSSGLRAADFPRWKRHISEQLRRRDRLQRQAFEEIIL 
Q YNKLLEKSDLHSVLAQKLQAEKHDVPNRHEI S PGHDGTWNDNQ 
LQEMAQLR I KHQEELTELHKKRGELAQ \RVIDLNNQMQRKDREM 
QMNEAK IAE CLQT I S DLETECLDLRTKL CDLERANQ TLKDEYDA 
LQI TFTALEGKLRKTTEENQELVTR WMAEKAQEANRLNARE * KR 
LQEAAS PAAERACRS S KGTSTSRTG 


7039 


155 


891


GAGAASDMSSGLRAADFPRWKRHISEQLRRRDRLQRQAFEEIIL " 
Q YNKLLEKS DLHS VLAQ KLQAEKHDV PNRHE I SPGHDGTWNDNQ 
LQEMAQLRI KHQEELTELHKKRGELAQ \R V I DLNNQMQR KDREM 
QMNEAK IAE CLQT I S DLETE CLDLRTKLCDLERANQTLKDE YDA 
LQI TFTALEGKLRKTTEENQE LVTRWMAE KAQEANRLNARE * KR 
LQEAAS PAAERACRS S KGTSTSRTG 


7040 


34 


789 


KITPPRRPHRCSSGHGSDNSSVLSGELPPAMGKTALFYHSGGSS 
G YE S VMRDS EATG SAS SAQDSTS ENSSS VGGRCRS LKTP KKRSN 


7041






FGSQRRRLIPALSLDTSSPVRKPPNSTGVRWVDGPLRSSPRGLG 
EPFEIKVYSIDDVERLQRRRGGASKEAMCFNAKLKILEHRQQRI 
AEVRAKYEWLMKELEATKQYLMLDPNKWLSEPDLEQVWELDSLE 
J YliEAriECVTERLESRVNPCKAHLMMITCFDIT 




1 


567


SGRVAMGRRRAPAGGSLGRALMRHQTQRSRSHRHTDSWLHTSEL 
NDGYDWGRLNLQSVTEQSSLDDFLATAELAGTEFVAEKLNIKFV 
PAEARTGLLS FEESQRI KKLHEENKQFLC I PRR PNWNQNTTPE E 
L KQAE KDNFLEWRRQ L \ VRLE E EQKL1 LT P FE RNLD F WRQLWR V 
IERSDIWQIVDA 


7042 


7 


345 


PIHMAAAAL,KADI\ISPLFPHIQGYLLLSASHG\ATSLHTKGAL 
PLETVTMYTVI PKS KYVLVKPDTQYP YS ENLDEFKRLAENSASN 
DDLLMAEVAISDYGDKLTLELREKY 


7043 


2 


2170 


ARGMAARDsusEEDIjVSYGTGLEPLEEGERPKKPIPLQDQTVRD 
EKGRYKRFHGAFSGG FSAG YFNTVGSKEGWTPSTFVS SRQNRAD 
KS VLG PED FMDEEDLS E FG I AP KAI VTTDD FAS KTKDR IRE KAR 
QLAAATAP I PGATLLDDLI TPAKI*S VGFELLRKMGWKEGQGVGP 
R VKRR PRRQKPDPG VKI YGCAL P PG S S EGS EGEDDD YLPDNVT F 
APKDVTPVDFTPKDNVHGLAYKGLDPHQALFGTSGEHFNLFSGG 
S ERAGDLGE IGLNKGRICLG ISGQAFG VGALEEEDDD I YATETLS 
KYDTVLKDEEPGDGLYGWTAPRQYKNQKESEKDLRYVGKILDGF 
S LAS KP I>S S KKI YP P PE LPRD YR PVHY FRPM VAATS ENSHLLQ V 
LS E SAG KAT PDPGTHS KHQLNAS KRAELLG ETP I QGS ATS VLE F 
LSOKDKERIKSMKQATDLKAAQJbKARSLAQWAOSSRAQPSPAAA 
AGHCSWNMALGGGTATLKASNFKPFAKDPEKQKRYDEFLVHMKQ 
GQKDALERCLDPSMTEWERGRERDEFARAALLYASSHSTLSSRF 
THA KE EDDSDQVEVPRDQ END VG DKQS A VKMKM FGKLTR DTFE W 
HPDKLLFQ/Rt»VGIiPRVKRDKYSVFNFIiTLPETASLPTTQASSE 
KVSQHRGPDKSRXPSR WDTSKHEKXEDS I SE FLRLARS KAEPPK 
QQSSPLVNKEEEHAPELSAN 


7044 
7045


276 


734


EVYLTDEFAKGRKVADItYELVQYAGNIIPRL^LtlTVGVVYVKS 
FPQSRKDILKDLVEMCRGVQHPLRGLFLRNYLLQCTRNILPDEG 

eptdeettgdisdsmdfvllnfaemnklwvrmqhqghsrdrekr 

ER ERQELR IL VG TNLVRLSQ V 


7046


3 


513 


LG FKMEALS RAGQEMS IiAALKQHD P Y I TS I ADLTGQVALYT FCP 
KANQWEKTDIEGTLFVYRRSASPYHGFTIVNRLNMHNLVEPVNK 
DLE FQLHE P FLLYRNAS LS I YS I W F YDKNDCHR I AKLMAD WEE 
ETRRSQQA/RSGQTESQPGQWLQRPQAfiRHPGDAEQSQG 


7047 


3 


513


LGFKMEALSRAGQEMSiAALKQHDP Y ITS I ADLTGQVALYTFCP 
KANQWEKTDIEGTLFVYRRSASPYHGFTIVNRLNMHNLVEPVNK 
DLE FQLHEP FLLYRNAS LS I YS I WF YDKNDCHR I AKLMAD WEE 
ETRRSQQA/RSGQTESQPGQWLQRPQAHRHPGDAEQSQG 




103 


466 


QM K I E KCG Ws EGLTS I KGNCHNF YTAI S KDVT YKELKNL LNS KN 
IMLIDVREIWEILEYQKIPESINVPLDEVGEALQMNPRDFKEKY 
NEVKPSKSDS/ I VFSYLAGVRSKKALDTAISLGFHSYYBR 


7048

7049 


92 


627
938


FFCLTLLSSWUYRHHATRRVISSPVFTMEDSGKTFSSEEEEANY 
WKDLAMTYKQRAENTQBELREFQEGSREYEAELETQLQQIETRN 
ROLLS ENNRLRMELETIKEKFEVQHSEGYRQISALEDDLAQTKA 
I KDQLQK Y I RELEQANDDLERAKRATDHGLS KTFE \QRLN\ Q A I 
SKKW 


7050 


393 
393 


1 y 
3 
1 

| J 

538 | 2 


KKTGSASYGGPPPGLGGPATXASVAGRCSS VGKI PARRCYEDEL 

^VFEAVGRIYELRLMMDFDGKNRGYAFVMYCHKHEAKRAVREL 

WyEIRPGRLLGVCCSVDNCRLPIGGIPKMKKREEILEEIAKVT 

3GVLDVIVYASAADKMKNRGLRLRGVREPPRGCHWLGRKLIAWX 
\SSLWG 

CRTGSAS YGGPPPGLGGPATXAS VAGRCS5 VGKI PARRCYEDEL 


7051






VPV t tAVw j. x isijKJjMMDFDGKNRGYAFVMyCHKHEAXRAVREI. 
NNYE I R PGRL LG VCCS VDN CRLF IGG I PKMKKRE E I L EE IAKVT 

EGVLDVIVYASAADKMKNRGLRLRGVREPPRGCHWLGRKLIAWX 
ASSLWG 


7052 


119 


816 


KKMNLAE 1 CDNAKKGRE YALLGNYDSSMVY YQGVMQQI QRHCQS 

VRDPAIKGKWQQVRQELLEEYEQVKS1VGTLESFKIDKPPDFPV 

SCQDEPFRDPAVWPPPVPAEHRAPPQ1RR/RQSRSKTSEERNGR 

SRS PGTCRPST\ PISKSEKFSTSRDKD YRARGRDDKGRKNMQDG 

ASDGEMPKFDGAGYDKDLVEALERDIVSRNPSIHWDDIADLEEA 
KKLLREAGVLPMWM 


7053 


467 




S C P GRGKMS KLLNPEE MTSRD Y Y FDS YAHFG I H E EML KDE VRTL 
TYRNSMYHNKHVFKDKVVLDVGSGTGILSMFAARQGPRR 


7054 


467 


715


S CPGRQKMS KLLNP EE M TS RD Y Y FDS YAHFG I HEEM L K DE VRTL 
TYRNSMYHNKHVFKDKWLDVGSGTGIIjSMFAARQGPRR 


7055


1 


1036


GTSQRSRETDARRRSAGAEPTARLPWPAALBEWPSCPCEPLGPG 
RRCRWDAMEYDEKIiARFRQAHLNPFNKQSGPRQHEQGPGEEVPD 
VTPEEALPELPPGEPEFRCPERVMDLGbSBDHFSRPVGLFLASD 
VQQLRQAIEECKQVILELPEQSEKQKDAWRLIHLRLKLQELKD 
PNEDEPN IR VLLEHR FYKEKSKS VKQTCDKCNTT I WGLI QTWYT 
CTGCYYRCHSKCIiNLISKPCVSSKVSHQAEYELNICPETGLDSQ 
DYRCAECRAPI /CS/IX3VVPSEARQCDYTGQYYCSHCHWNDLAV 
I PAR WHNWD FE PRKVS R CSMR YLALMVSR P VLRLRE I N 


7056 


2 


527 


DSRRVSWRSWI^E/WGKH^LFIWLS^LLFWKTFLLYNQGP 

EYHYLHQMLG/ALCLSRASASVLNLNCSLILI.PMCRTLIiAYLRG 

SQKVPSRRTRRLLDKSRTFH ITCGATI CI FSGVHVAAHLVNALN 

FSVNYSEDFVELNAARYRDEDPRKLLFTTVPGLTGVCMEVVLFb 
M 


7057 


2 


527 


DSRRVS WRS WIAI^/ WGKHLCJjFI WLSM^fVLLFWICTFLL l YNQC^ P 

EYHYLHQMLG/ALCLSRASASVLNLNCSLILLPMCRTLLAYLRG 

S QKVPSRRTRRLLDKSRTFHI TOGATI CI FSGVHVAAHLVNALN 

FSVNYSEDFVEI*NAARYRDEDPRKLLFTTVPGLTGVCMEVVI>7L 
M 


7058 


1368 




G I YXiHVN E KI PRPTC I GURQ ENDKENLULENHRDQELLHAS CQA 
SGEVPSQASLRGFFTEDEPGCFGEGENLPEALQNIQDEGTGEQL 
S PQER I S EKQLGQHL PNPHSGEMSTM WLEE KRETSQ KGQPRAPM 
AQ KLPTCRE CGKTF YRNSQL I FHQRTHTG ET Y FQ CT I CKKAFLR 
SSDFVKHQRTHTGEKPCKCDYCGKGFSDFSGIiRHHEKIHTGEKP 
YKCPICEKSFIQRSNFNRHQRVHTGEKPYKCSHCGKSFSWSSSL 
^Q RSHL GKKPFQ*PVTKLSFPISISQPSHKNTQLHQEELCLR 


7059


1 


469 


FSGFGAVPDALGCRMSDLRITEAFLYMDYLL-bRALCCKGPPPAR 
PEYDLVCIGLTGSGKTSIjLSKLCSES PDNWSTTGFS IKAVPFQ 
NAI LNVKE LGG ADN I RKYWSRY YQGSQGVI FVLDS AS S EDDLE A 
ARN*SCTQLLQHPQLCTLPFLILA 


7060 


1


1178 



WPAFPRQPAAAAMDALLGTGPRRARGCLGAAGPTSSGRAARTPA " 
APWARFSAWLECVCWTFDLELGQALELVYPNDFRLTDKEKSSI 
CYLSFPDSHSGCLGDTQFSFRTOQCGGQRSPWHADDRHYNSRAP 
VALQRE PAH YFG YVY FRQ VKDSS VKRG YFQKSLVLVS RL P FVRL 

FQALLSLIAPEYFDKLAPCLBAVCSEIDQWPAPAPGQTLNLPVM 
3 VWQ VR IPS R VDKS ES S P P KQ FDQENLLPAP WLAS VHELDLF 

RCPRp\n^THMQTLWELMLLGEPLLVLAPSPDVSSEMVLALTSCL 
2PLRFCCDFRPYFTIHDSEFKEFTTRTQAPPNWLGVTNPFFIK 
rLQHWPHILRVGEPKMSGDLPKQViCLKKPFKV*RPWDTKP 




90 


1670 



dVNLPPSLWPWEEAMDSTKSEPLKGSPEAEDGNIEYKKLVNPSQ 
niFEHLVTQMKWRLQEGRGEAVYQIGVEDNGLLVGLAEEEMRAS 
LKTLHRMAEKVGADI TVLREREVDYDSDMPRKI TEVLVRKVPIW" " 
QQFLDLRVAVLGNVDSGKSTLLGVLTQGELDNGRGRARLNLFRH 
IjHE I QS GRTS S I S FE I LGFNS KGE VHG I NGTQWGQTLRMGW * + * 
RT * DGGRVWRLFE I V* MNALRGL*TSSAPLRKSMGNQI*N* I KNG 
VKI KRQGHPGNGLG PGKSEGVGRAGRRH* GPWALGQVVNYSDSR 
TAEEICESSSKMITFIDLAGHHKYLHTTIFGLTSYCPDCALLLV 
S ANTGI AGTTREHLGIiAIiALKVPFFI WS KI DLCAKTTVERTVR 
QLERVLKQ PGCHKVPMLVTSEDDAVTAAQQFAQS PNVTPI FTLS 
S VSGES LDLLKVFLNI LPPLTNS KEQEELMQQLTEFQVDEI YTV 
PEVGTWGGTLSR*IDLLATIiPTQPSPIYSKTSWPKGGDPGI 


7061 


364 


710 


ARMPSPLGPPCLPVMDPETTLEEPETARLRFRGFCYQEVAGPRE 
ALARLRELCCQWLQPEAHSKEQMLEMLVLEQFLGTLPPEIQAWV 
RGQRPGS PEEAAALVEGLQHDP * ARMPS PLGPPCLPVMDPETTL 
E E PETARLRFRG FCYQE VAGPREAIARLREIjCCQWIiQ PEAHS KE 
QMLEMLVLEQ FLGTIiP P E I QAWVRGQRPG S PEEAAAL VEGJbQHD 
PGQLLG 


7062 


71 


744 


AKAGTNLERLHWIjS Y F FC I PKH KLKS SQKDKVRQFMACTQAGER 
TAIYCLTQNEWRIjDEATDSFFQNPDSLHRESMRNAVDKKKIiERL 
YGR YKD PQDENKIG VDG I QQFCDD LS LDPAS I S VLVI AW KFRAA 
TQCEFSRKEFLDGMTELGCDSMEKLKALLPRLEQEIiKDTAKFKD 
F YQFTFT FAKNPGQKGLDIj *MAGAYWKLV1»SGRFKFI» YLWNTFL 
MEHH 


7063 


2 


562 


lrtvpdlpgrrframrtgOrr * peLppdmnsLeqaedlkaferr 

LTE Y IHCLQ P ATGRWRMLL I VVS VCTATGAWNWLI DPETQKVS F 
FTSLWNHPFFTlSCITLIGLFFAGIHKRWAPSriAARCRTVlA 
EYNMS CDDTGKLI LKPRPHVQ* QSS L I VMGLKIAFLR 1 SDTAKS 
HKGFLLRLDM 


7064 


300 


684 


RDTGSDPSSTRRLCSTCCTGH*PAEPIASPHPSRGTCPPASSAS 
S RRTG CWTC P PES GHAQARRSRRAS AS R WGARGAVRS AVAARGC 
SSRAGRWLETPGRRRGPPACAAAAGRLRGPAP*AAPPTASVPAR 
CRC PAARTGAPAAATWLRRRLSGLRAPALGRRRS PGPS PKSAAP 
PLLTPLGAGRAGGSRANS 


7065


1 


555 


ATTTHSARRSGRGAAAEAAASAAGGRQKGPDRKAWEGRRTTPGG 
RSQSEPKAPPPQKRSEAAFASMAHSPVAVQVPGMQNNIADPEEL 
FTKLERIGKGSFGEVFKGIDNRTQQWAIKIIDLEEAEDEIEDI 
0QE I TVLSQCDSS YVTKYYGS YIiKGSKLWI IMEYLGGGSALDLL 
RAGPFDEFQ 


7066 


356 


676 


PGPQRGPWRAREGGHPLDPADHPRAPASLRSNVRAATMMQICDT 
YNQKHSLFNAMNRFIGAVNNMDQTVMVPSLLRDVPIADPGLDND 
VGVE VGGSGGCL EERTP P 


7067 


152 


973 


KEN I TMATE I GS P PR FFHM PRFQHQAPRQLF Y KRPDFAQQ QAMQ 
QLTFDGKRMRKAVNRKTIDYNPSVIKYLENRIWQRDQRDMRAIQ 
PDAGYYNDLVPPIGMLNNPMNAVTTKFVRTSTNKVKCPVFWRW 
TPEGRRLVTGAS SGEFTLWNGLTFNFET I LQAHDS P VRAMTWSH 
NDMWMLTADHGG YVKY WQSNMNNVKMFQAHKEAI REARF IHNI P 

FSWPIVfWKLFSKCILGAEMHGLCQFLGNFIiHPINTlFFFVFT 
HSPFCWAPF 


7068 


222 


816 


DTMKE YVLLIjFLAIjCSAKP FFS PSH I AXjKNMMIiKDMEDTDDDDD 
DDDDDDDDDDBDNSLFPTREPRSHFFPFDLFPMCPFGCQCYSRV 
VHCSDLGLTSVPTN I P FDTRMLD LQNN K I KB I KENDFKGLTSLY 
GLIIiNNNKLTKIHPKAFLTTKKLRRLYIiSHNQLS EI PLNLPKSL 
AELR I HENKVKKI QKDT FKKK 


7069 


1147



1765 


FRDHRRY F YVNEQS GESQ WE F P DGEEEEEES QAQENRDETIiAKQ 
TLKDKTGTDSNSTESSETSTGSLCKES FSGQVS S SSLMPLT P FW 
TI*LQSNVPVLQPPLPr*EMPPPPPPPPESPPPPPPPPPAPKMPPP 
EKTKKGRKD KAXKSKTKMPSLVKKWQS IQREIiDEEDNSSSSEED 1 
RVSTAQKRIEEWKQQQLVSGMAERNANFEA | 


7070 


1 


547 


DGTMEDSEAVQ RATAL I EQRIiAQEEENEKLRGDARQKljPMDlZjV 1 
ItEDEKHHGAQS AAIiQKVKGQER VRKTSIjDLRR E 1 1 DVGG IQNL I 
ELR KKRKQKKRDALAASHE P PPE P E E ITGPVDEETFLKAAVEGK 

MKVIEKPIjADGGSADTCDQFRRTAI*HRASLEGHMEILEKIjIjDNG 
ATVDPQ ( 


7071 


2 


921 


ARGTLRAI.ETAKKVGKVGANGQKAAGPSADSVTENKIGS PPKTP 
VSNVAATSAG P SNVGTELNS VPQKS S P FLTR VPAY PPHS EN I QY 
FQDPRTQIPFEVPQYPQTGYYPPPPTVPAGVAPCVPRFVRSMNV 
PBSSI,PPASMPYADHYSTFSPRDRMNSSPYQPPPPQPYGPVPPV 
PSGMYAPVYDSRRIWRPPMYORDDIIRSNSLPPMDVMHSSVYQT 
SLRERYNSLDGYYSVACQPPSEPRTTVPLPREPCGHLKTSCEEQ 
IRRKPDQWAQYHTQKAPLVSSTLPVATQSPTPPSTI.NRGEGS 


7072 


2 


921 


argtlraletakkvgkvgangqkaagpsadsvtenkigsppktpH 

VSNVAATS AGP SNVGTE LNS VPQKS S PFLTR VPAYPPHSENIQ Y 
FQDPRTQIPFEVPQYPQTGYYPPPPTVPAGVAPCVPRFVRSNNV 
PESSI.PPASMPYADHYSTFSPRDRMNSSPYQPPPPQpyGPVPPV 
PSGM YAP VYDSRRI WRPPMYQRDDI IRSNSLPPMDVMHSS VYQT 
SLRERYNSLDGYYSVACQPPSEPRTTVPLPREPCGHI.KTSCEEQ 
_IRRKPDQWAQ YHTQ KAPLVS ST LPVATQS PTP PS TLNRGEGS I 


7073 


50 


504


LiAHGSFGVSDFPAPAAAPAHTLTSFSGSLSPQFRKPLGRAPAMPl 
LVRYRKWI LGYRCVGKTS LAHQFVEGEFS EGYDPTVENTYSKI 1 

VTLG30)EFHLHLVDTAGQDEYSILPYSFIIGVHGYVliVYSVTSL 
HSFQV I ES LYQKLHEGHGK 


7074 


263 


1003 " 


VCP VLCSTKQEPGHS s l vt yfgkptrr kefllghci aagkmnis 
VDLETN YAELVLDVGRVTLGENS RKKMKDC KLRKKQNER VS RAM 
GALLNSGGG VI KAE I ENED YS YT KDG IGLDLENS F SNI LI»F VPE 
YIJ^FMQNGNYFIjIFVTCSWSIiNTSGLRITTLSSNLYKMITSAKV 
MNATAALE FLKDMKKTRGRIi YL RPELiI-AKRPRVD I QEENNM KAL 
AGVFFDRTEIiDRKEKI.TFTE5THVEI 


7075 


598 


1005 


nyinfffrkeypphvqkveinpvrlsrlqgverimkkteese'sqH 

VEPEI KRKVQQKRHGST YQPTPPLSPAS KKCItTHLEDLQRNCRQ 

AITLNESTGPI.LRTSIHQNSGGQKSQNTGLTTKKFYGNNVEKVP 
IDII 


7076 
7077 " 


279 


1049 


LQSESSNAAEGNEQRHEDEQRS KRGG WS KGRKRKKPIjRDSNAP K 1 
SPLTGYVRFMNERREQLRAKRPEVPFPEITRMLGNEWSKLPPEE 

kqryldeadrdkerymkeleqyqkteaykvfsrktqdrqkgksh 

RQDAARQATHDHEKETEVKERS VFDI PI FTEEFLNHS KAREAEL 

rqlrksnmefeernaaliqkhvesmrtaveklevdvlqersrntv 
lqqhletlrqvltssfasmplpexgetptvdtidsym 1 




3 


1119 


SSMGSNSElNGIoALRICrDKYGFLGGSQYSGSLKSSIPVDVARQRH 

elkwldmfsnwdkwlsrrfqkvklrcrkgipsslrakawqylsn 
skelleqnprkfeelerapgdpkwldviekdlhrqfpfhemfaa 

RGGHGQQDLYRILKAYTIYRPDEGYCQAQAPVAAVLLMHMPAEQ 
AFWCLVQI CDKYLPG Y YS AGLEAIQLDGB I FFALLRRAS PLAHR 
HLRRQR I DPVLYMTEWFMCIFARTLPWAS VLRVWDMFFCEGVKI 
IFRVALVLIiRHTLGSVEKLRSCQGMYETMEQLRNXPQQCMQBDF 

LVHEVTNLPVTEALIERENAAQLKKWRETRGELQYRPSRRLHGS 
RAIHEERRRQQPPLGPSSS | 


7078 


483 


767 ~ 


FOGQRMAGEQKPSSNLLEQFII.IAKGTSGSAI.TALISQVLEAPG 
VYVFGE LLE LANVQELAEGANAA YLQ IiLNLFAYGTYPD Y IANKE 
SLPELY [ 


7079 


2 


37* 



SWEFKRPKEPSGSDGESDGPIDVGQEGQjLSQMARPLSTPSSSQ 1 
^QARKKRRGIIEKRRRDRINSSLSELRRLVPTAFEKQGSSKLEK 








AEVLQ^5TVDHLKMLHATGGTGTHALLPQASFIQQIF " 


7080 


200 


595 


VQLPLEAPCIiSIjIiSCRDHSGGNRDLSRRHRDCRVYGS PQDGIPY 
LTHPLCHQDWSVGRLQIRALATPGHTQGHLVYLLDGEPYKGPS 
CLFSGDLLPLSGCGEFPRKREELGEEGETEVRAATVPWRALKP " 


7081 


213 


506 


A VTEEEM I I>NS LS LC YHNKL I LAPM VR VGTLPMRLLALD YGAD I 
VYCEEIilDLKMIQCKRWNEVLSTVDFVAPDDRWFRTCEREQN 
RWPQMGTS 


7082 


3 


1137 


APSRNTMi^WCRGPVLLCLRQGLGTNSFLHGLGOEPPEdARSfJT" 
CCRSSPRDLRDGEREHEAAQRKAPGAESCPSLPLSISDIGTGCL 
SSLENLRLPTLREESSPRELEDSSGDQGRCGPTHQGSEDPSMLS 
QAQSATEVEERHVSPSCSTSRERPFQAGELILAETGEGETKFKK 
LFRLNNFGLliNSNWGAVPFGKI VGKFPGQ I LRSS FGKQYMLRRP 
ALED Y WLMKRGTAITFPKDINM I LSMMDINPGDTVLEAGSGSG 
GMSLFLSKAVGSQGRVISFEVRKDHHDIAKKNYKHWRDSWKLSH 
VEE WPDNVD F I HKDI S GATED I ICS LTFDAVALDMLN PH VTL PVF 
YPHLKHGGVCPVYWN I TQVI ELLD 


7083 


115 


541 


RSNAVQLTRMEYAMKSLSLLYPKSLSRHVSVRTSWTQQLLSEP 
SPKAPRARPCRVSTADRSVRKGIMAYSIiEDLLLKVRDTLMLADK 
PFFLVLEEDGTTVETEEYFQAXiAGDTVFMVLQKGQKWQPPSEQG 
TRHPIjSLSHK 


7084 


3 


522 


NS VS VSSQSRFIiASVPGTGVQRSAAADMAASTAAGKQRI PKVAK 
VKNKAPAEVQITAEQLLREAKERELELLPPPPQQKITDEEEIjND 
YKLRKRKTFEDN I RKNRTVI SNW I KYAQ WEE S LKE I QRARS I YE 
RALDVDYRNI TLWLKYAEMEMKNRQVNHARNI WDRAITTL 


7085 


243 


1499 


RQLARLRRRGWRSPFGGAPMAHITINQYLO^VYEAIDSRDGASC - 

AELVSFKHPHVANPRLQMASPEEKCQQVLEPPYDEMFAAHLRCT 

YAVGNHDF I EAYKCQTVI VQS FLRAFQAHKEENWALP VM YAVAL 

DLRVFANNADQQIiVKKGKSKVGDMliEKAAEMiMSCFRVCASDTR 

AGIEDSKKWGMLFLVNQLFKIYFKINKLHLCKPliIRAIDSSNLK 

DDYSTAQRVTYKYYVGRKAMFDSDFKQAEEYLSFAFEHCHRSSQ 

KNKRMILIYLLPVKMLIiGHMPTVELLKKYHLMQFAEVTRAVSEG 

NLLIiLHEALAKHEAFFIRCGIFLILEKLKIITYRNLFKKVYLLD 

KTHQLSLDAFLVALKFMQVEDVD IDEVQCIIiANLI YMGHVKGYI 

SHQHQKLWSKQNPFPPLSTGC 


7086 


256 


525 


ILAARMG KQNS KLR PEVMQDL t*E STDFTEHE IQE W YKGFLRDCP 
SGHLSMEEFKKIYGNFFPYGDASKFAEHVFRTFDANGDGTIDFR 
EF 


7087 


166 


723 


lsgssagkvaapcvppsnhelvpittenapknwdkgegasrgg 
ntrksledngstrvtpsvqphlqpirnmsvsrtmedsceldlvy 

VTERI IAVS FPS TANEENFRSNLRE VAQMLKSKHGGNYLIi fnls 
ERRPD I TKLHAKVIiEFGW PDIjHTPAIjEKI cs I CKAMDTWLNAHP 
HRCRVLHNKG 


7088 


104 


759 


GTSAASPSSLLEMAGEITETGELYSSYVGLVYMFNJLIVGTGALT 
i»it*xvj\r ji.iw*\Nu v^JaVLiijVr JjGFMSFMTTTFVIEAMAAANAQI»HW 
KRMENLKEEEDDDSSTASDSDVLIRDNYERAEKRPILSVQRRGS 
PNPFEITDRVEMGQMASMFFNKVGVNLFYFCIIVYLYGDLAIYA 
AAVPFSLMQVTCSATGNDSCGVEADTKYNDTDRCWGPIiRRVD 


7089 


33 


1775 


SVCWEDRYLKARMEESPLSRAPSRGGVNFLNVARTYIPNTKVEC 
HYTLPPGTMPSAS DWIGI FKVEAACVRDYHTFVWS S VPESTTDG 
SPIHTSVQFQASYLPKPGAQDYQFRYVNRQGQVCGQSPPFQFRE 
PRPMDELVTLEEADGGSDILLWPKATVLQNQLDESQQERNDLM 
QLKLQLEGQVTEIiRSRVQELERAIATARQEHTELMEQYKGISRS 
HGE I TEERD ILSRQQGDH VAR IIiELBDD I QT I S EKVLTKE VELD 
RLRDTVKALrREQEKLLGQLKEVQADKEQSEAELQVAQQENHHL. 
NLDLKEAKSWQEEQSAQAQRLKDKVAQMKDTLGQAQQRVAEIiEP 








LKEQliRGAQE LAAS SQQKATLLGEEIASAAAARDRT I ABLHRSR 
LBVAEVNGKLAELGLHIjKEEKCQWSKERAGLLQSVEAEKDKILK 
LSAEILRliEKAVQEERTQNQVFKTELAREKDSSLVQLSESKREfi 
TELRSALRVLQKEKEQLQEEKQELLEYMRKLEARLEKVADEKWN 
EDATTEDEEAAVGLSCPAALTDSEDESPEDMRLHPMAFVSVETQ 
ASLLLGLE 


7090 


33 


1775 


S VCWEDRYLKARMEESPLSRAPSRGG VNFLKVARTYI PNTKVEC 
HYTIjPPGTMPSASDWIGIFKVEAACVRDYHTFVWSSVPESTTDG 
S P IHTS VQFQAS YLPKPGAQL YQ FR YVNRQGQVCGQS PPFQFRE 
PRPMDELVTLEEADGGSDIIiLWPKATVLQNQIjDESQQERNDIiM 
QLKLQLEGQVTELRSRVQELERALATARQEHTELMEQYKGISRS 
HGEITEERDILSRQQGDHVARILELEDDIQTISEKVIiTKEVELD 
RLRDTVKALTREQEKLLGQLKEVQADKEQSEAELQVAQQENHHIj 
NLDIjKE AKS WQEEQS AQAQRLKD KVAQMKDTLGQAQQR VAELE P 
LKEQDRGAQEIiAASS QQKATLLGE EIiASAAAARDRTI AELHRS R 
LEVAETOGKLAEIiGLHLKEEKCQWSKERAGLLQSVEAEKDKILK 
LSAE ILROLEKAVQEERTQNQVFKTELAREKDSSLVQI»S ES KREL 
TELRSAIiRVLQKEKEQLQEEKQELLEYMRKLEARLEKVADEKWN 
EDATTEDEEAAVGLS CPAAI/TDS EDE S PE DMRLHPMAFVS VETQ 
ASLLLGLE 


7091 


186 


1076 


EGMIiTREHRCGRSBEQEIiEPWPSPKKARSGRWLRNGFKRKMEEP 
EEPADSGQSLVPVYIYSPEYVSMCDSliAKIPKRASMVHSLIEAY 
ALHKQMRIVKPKVASMEEMATFHTDAYLQHLQKVSQEGDDDHPD 
SIEYGLGYDCPATEGIFDYAAAIGGATITAAQCLIDGMCKVAIN 
WSGGWHHAKKDEASGFCYLNDAVLGILRLRRKFERILYVDLDLH 
HGDGVEDAFS FTS KVMTVS LHKFS PGFFPGTGD VS DVGLG KGR Y 
YSVNVPXQDG I QDEKY YQ I CER YEPPAPNPGL 


7092 


522 


809 


KQGINEDQEESQKPRLGEGCEPISKRQMKKLIKQKQWEEQRELR 
KQKRKEKRKRKKIiERQCQMEPNSDGHDRKRVRRDWHSTLRLI I 
DCSFDXLM 


7093 


454 


655 


NFGVSGVELAQQASMVRMS FVIAACQLVLGLLMTSX.TESS I QNS 
ECPQLCVCEIRPWFTPQSTYREA 


7094 


2 


508 


FVRSMHWGVGFASSRPCWDLSWNQSISFFGWWAGSEEPFSFYG 
DI I AF PLQD YGG IMAGLGSDP WWKKTL YLTGGALIiAAAA YliLHE 
LLVIRKQQE IDS KDAI ILHQFARPNNGVPSLS P FCLKMETYIiRM 
ADLPYQNYFGGKLSAQGKMPWIEYNHEKVSGTEFII 


7095 


1 


411 


IASSLPKMASLIjQSDRVr,YtiVQGEKKVRAPLSQLYFCRYCSELR " 
S LECVS HE VDSH YC PS CLENM PS AEAKIiKKNRCANCFDCPGCMH 
TLSTRATS I STQLPDDPAKTTMKKAY YIACG FCRWTS RDVGMAD 
KSVGE 


7096 


224 


2067 


ETRSLAVQEKPS QAGRRRS S RI S FAGAL FLTR FLIiQELLLNN FC 
SAMSPAPDAAPAPAS ISLFDLSADAPVFQGLSLVSHAPGE AIiAR 
APRTS CSGSGERE S P ERKLLQGPMD 1 SE KI*FCSTCDQTFQNHQE 
QREH YKLD WHRFNLKQRLKDKPLLSALDFEKQS STGDLSS I SGS 
EDSDSASEEDLQTIiDRERATFEKLSRPPGFYPHRVLFQNAOGQF 
L YAYRCVLG PHQDP PEEAELLLQNLQS KGPRDCWLMAAAGHFA 
GAI FQGRE VVTHKTFHR YTVRAKRGTAQGLRDARGGPSHSAGAN 
LRRYNEATLYKDVRDLLAGPSWAKAIiEEAGTILLRAPRSGRSLF 
FGGKGAPLQRGDPRLWDIPLATRRPTFQELQRVLHKLTTLHVYE 
EDPREAVRLHSPQTHWKTVREERKKPTEEEIRKICRDEKEALGQ 
NEESPKQGSGSEGEDGFQVELELVELTVGTLDLCESEVLPKRRR 
RKRNKKEKSRDQEAGAHRTLLQQTQEEEPSTOSSQAVAAPLGPL 
LDEAKAPGQPELWNALIiAACRAGDVGVLKLQLAPSPADPRVLSL 
LSAPLGSGGFTLLHAAAAAGRGSWRLLIiEAGADPTVQCQDH 


7097 


256 


1228 


IRTKSAATWEAWPQCGREGSRI ITEPCEANAGSRQELQTER ISS 
FLAAQGD QAFHSGLETNNSNS ELPLRVGLKVAQG S PLMGGQ VS A 








SNSFSRLHCRNANEDWMSALCPRLWDVPLHHLS I PGSHDTMTYC 
LNKKSPISHEESRLLQLLNKALPCITRPWLKWSVTQALDVTEQ 
LDAG VR YLDLR I AHMLEGS EKT^HFVHMVYTTALVE DTLTE I S B 
WLERHPRE WI LACRNFEGLSEDLHEYLVACI KNI FGDMLCPRG 
EVPTLRQLWSRGQQV I VS YEDESSLRRHHELWPGVP YWWGNRVK 
TEAL I RYLETMKSCGR 


7098 


82 


956 


S S FL KRCRKVLG CWG IPS EQS L FS TLEEPRDKEI DN YC VMRLQT 
EARSG FWAPNRFP VN I CRMTAVDGDRGGSS RETCRCHFH PS LEA 
LVLLLQDWQPGGVG I CT S FLG I S WAL LDYHRAL RTC LPS KPLLG 
LGSSVIYFLWNLLLLWPRVLAVALFSALFPSYVALHFLGLWLVL 
LLWVWLC^TDFMPDPSSEV^YRVTVATILYFSWFNVAEGRTRGR 
Al IHFAFLLSDS I LLVATWVTHS S WL PSG 3 PLQLWLPVGCGCFF 
LGLALRLVYYHWLHPSCCWKPDPDQYD 


7099 
7100 


992 


210 


LFRLAPGFLRSLARQGYHQIWAFPFLPSGATATWPAASRSRSLA 
ARSLPRSPARPGPNDALLGEHDFRGQGVRAQRFRFSEEPGPGAD 
GAVLEVHVPQIGAGVSLPGILAAKCGAEVILSDSSELPHCLEVC 
RQS CQMNNLPHLQ WGLTWGHIS WDLLALPPQDI ILASDVFFEP 
EDFEDILATIYFLMHKNPKVQLWSTYQVRSADWSLEALLYKWDM 
KCVHIPLESFDADKEDIAESTLPGRHTVEMLVISFAKDSL 




one 


671 


ANGG F W EAAPGS EVSLP LWVPTASHSKTTALG IGSAPPPHLS VL 
! FLFSFPPQLGDPLEAFPVFKKYDRNGLNVS2ECKRVSGLEPATV 
DWAFDLTKTNMQTMYEQSE WGWKDREKREEMTDDRAW YL I AWEN 
SS VP VAFSHFR FDVERGDEVLYW 


7101 


2 


503 


WRGG PRRAKiUoAGGAVGWVLLVRGVHS VRAGGGRPPRAAJDMKKD 
VRILLVGEPRVGKTSLIMSLVSEEFPEEVPPRAEEITIPADVTP 
ERVPTHIVDYSEAEQSDEQLHQEISQANVICIVYAVNNKHSIDK 
VTSRWI PLINERTDKDSRLPLILGGNKSDLVEYSR 


7102 


2 


503 


WRGG P RRAKRLAGGAVGWVLLVRGVHS VRAGGGR PPRAADMKKD 
VR ILLVGEPR VGKTSLIJMSLVSEEFPEEVP PRAE E I TI PADVTP 
ERVPTHIVDYSEAEQSDEQLHQEISQANVICIVYAVNNKHSIDK 
VTSRWI PLINERTDKDSRLPLILGGNKSDLVEYSR 


7103 


119 


438 


GSQSS VAVN IRS GTDEE SMDLMMGQAS S VN IAATASE KSS S S ES 

LSDKGSELKKS FDAWFDVLKVTPEE YAGQITLMD VPVFKAI QP 
DELSSCGWNKKEKYSSAP " 


7104 


1670 


795 j 


RLWEHRSVSAGASGWGLSSPGCLLLHPSLPEEBRVDILINWAGV 
MRC PHWTTEDG FE MQ FGVNHLGE AW AG AAP WVQA I L PRRP P K VL 
GF*V*VKSDLFIILNPGHFLLTNLLLDKLKASAPSRtINLSSLA 
HVAGH I DFDDLNWQTRKYNTKAAYCQS \ KLAI VLFTKELSRRLQ 
GSGVT VNALHPGVARTELGRHTG IHGS TFLQHHN \ WAHLLAAWS 

KSPRSWPAPAQHNTLAVAEELAWISGKYFDGLKQKAPAPEAED 
EE VARRLWAES ARLVGLEAPS VREQPL PR 


7105 
7106 


765 


143 


GQMCRR PS PKSTS CLSMTCDLP / RGLQD PQCLALFRVAVDKHQA 
LLKAAMSGQGVDRHLFALYIVSRFLHLQSPFLTQVHSEQWQLST 
SQIPVQQMHLFDVHNYPDYVSSGGGFGPADDHGYGVSYIFMGDG 
MITFHISSKKSS TKTDSHRLGQH I EDALLD VAS LFQAGQHFKRR 
FRGSGKENSRHRCGFLSRQTGASKASMTSTDF 


7107 


14 
1145 


1064


591


GLQAGHPHPRSASRIPEADTH\YSKLQRAFDSIVNKDHKRMFGT 
YFRVGFFGSKFGDLDEQEFVYKEPAITKLPEISHRLEAFYGQCF 
3AE FVE VI KDSTP VDKTKLD PNKAY I Q I TFVEP Y FDE YEMKDRV 
rYFEKNFNLRRFMYTTPFTLEGRPRGELHEQYRRNTVLTTMHAF 
PYIKTRISVIQKEEFVLTPIEVAIEDMKKKTLQLAVAINtqeppd 
WCMLQMVLQGSVGATVNQGPLEVAQVFIiAE I PAD P KLYRHHNKL 
ILCFKEFIMRCGEAVEKNKRLITADQREYQQELKKNYNKLKENL 
IPMI ERKI PELYKP I FRVESQKRDS FHRS S FRKCETQLSQGS 
'I*WLQTGKKK — 


7108 


1 


942 


VKVALLLTNLEQPRTESEWENSFTLKMFLFQFVNLNSSTFYIAF 
FLGR FTGHPGAYLRL INRWRLEECH PSGCL I DL CMQMG 1 1 MVLK 
QTWNNFMELGY PL IQNWWTRRKVRQEHGPER KI S FPQWE KD YNL 
Q PMNAYGLFDE YLEM I LQFG FTT I F VAAFPLAPLIiALLNNI I E X 
RLDAYKFVTQWRRPLASRAKDIGIWYGILEG1GILSVITNAFVI 
AITSDFI PRLVYAYKYGPCAGQGEAGQKCMVGY VNASLS VFRI S 
DFENRSEPESDGSEFSGTPLKYCRYRDYRDPPHSLVPYGYTLQF 
WKVLAW 


7109 


964 


102 


WDQRKRNS LVPGPAHGPAQEEPWE KKESLGAAQEALS I QLQPKE 
TQPFPKSEQVYLHFLSWTEDGPEPKDKGSLPQPPITEVESQVF 
SEKLATDTSTFEATSEGTLELQQRNPKAERLRWS PAQEES FRQM 
WIHKEIPTGKKDHECSECGKTFIYNSHLWHQRVHSGBKPYKC 
SDCGKTFKQS S NLG Q HQR I HTGEKP FE CNECG KAFRWGAHLVQH 
QRIHSGEKPYECNECGKAFSQSSYLSQHRRIHSGEKPFICKECG 
KAYGWCSELIRHRRVHARKEPSH 


7110 


96 


697 


RLDN FSG FLVE VTKEE RH I VKPLYDR YRLVKQMLTRAS I T PVLG ' 
S PSTKRRGQMLQP I 1 EGETAHFFEE I KEEEEDGVNLS SELGDML 
KTAVQVQSSLKNSESDVEEMQEKLALDLRLSSSRAASMPELLEQ 
LWKARAEKKKLRKTIiREFEE AFYQQNGRNAQKEDRVPVL E EYRE 
YKKIKAKLRLLEVLISKQDSSKS1 


7111 


2 


414 


GSGLYRGPTPGGQCIWKPNSMPPDHERNFGFTQFALELNELTAE 
LKRSLPSTDTRLRPDQRYLEEGN1QAAEAQKRRIEQLQRDRRKV 
MEENN I VHQAR F FRRQTDS SG KEW W VTMNTY WRLRAE PG YGNMD 
GAVLW 


7112 


103 


495 


PRCFPVADRGRtlGGLPDWTIMEGKTLNLTCTVFGNPDPEVIW 
FKNDQD I QLSEHFS VKVBQAK YVSMT I KG VTS EDSG K YS INI KN 
KYGGEKIDVTVS VYKHGEKI PDMAPPQQAKPKLI PASASAAGQ 


7113 


1 


824 


KC LRQAWHEAPS S LAFTRWCS REE RAEGGGNLHRS I TRDPKP PG 
LRPSQRPMDDKKKKRSPKPCLAQPAQAPGTLRRVPVFTSHSGSL 
ALGLPHLPS P KQRAKFKRVGKE KGRP VLAGGGSGS AGTPLQH S F 
IiTEVTDVYEMEGGLLNLLNDFHSGRLQAFGKECSFEQLEHVREM 
QEKLAR LHFSLD VCGEEEDDEEEEDG VTEGLPEEQKKIWADRNIi 
DQLLS NLG S CLGALVPGGMRGGEGT YSQSHSWALG E KVG VHG S K 
SSGPLNLPRR 


7114 


3 


14 92 


VWEVDEQ I DHYKES QDKFLWQAAFIGKE TL KD ESG QECK1 CR KI 
I YLNTD F VS VKQRLPKY YS WERCS KHHLNFLGQNRS YVR KKBDG 
CKAYW KVCLH YNLHKAQ PAERF FDPNQRGKALHQKQALR KSQRS 
QTOEKLYKCrTECGKVFlQKANLVVHQRTHTOEKPYECCECAKAF 
SQKSTLIAHQRTHTGEKPYECSEOGKTFIQKSTLIKHQRTHTGE 
KPFVCDKCPKAPKSSYHLIRHEKTHIRQAFYKGIKCTTSSLIYQ 
RIHTSEKPQCSEHGKASDEKPSPTKHWRTHTKENIYECSKCGKS 
FRGKSHLSVHQRIHTGEKPYECSICGKTFSGKSHLSVHHRTHTG 
EKP YECRRCGKA FGEKSTL I VHQRMHTGEKP YKCNE CG KAFS E K 

SPLIKHQRIHTGERPYECTDCKKAFSRKSTLIKHQRIHTGEKPY 
KCSECGKAFSVKlSTLTVHHT?TMTf!:T3*K"PVP*r , onr , r;vA dcpvcti -r 

AVV "*-'*- ,v ' v '* v ** r «^ » lux vxviK J. n luijur 1 &v»XvlJv»^i\AJrSaljlvSTliI 

KHQRSHTGDKNL 


7115 


1 


947 


NAAHGYNWGLWCMYI IPPQDWLDRGDESAPIRTPAMIGCSFWD " 
RE YFGD I GLLD PGME VYGGENVKLGMRVWQCGGSME VL PCSRVA 
HIERTRKPYNNDIDYYAKRNALRAAEVWMDDFKSHVYMAWNIPM 
SNPGVDFGDVSERLALRQRLKCRSFKWYLENVYPEMRVYNNTLT 
YGEVRNSKASAYCLDQGAEDGDRAILYPCHGMSSQLVRYSADGL 
LQLGPLGSTAFLPDSKCLVDDGTGRMPTLKKCEDVARPTQRLWD 
FTQSGP I VSRATGRCLEVEMS KDANFGLRLWQRCSGQKWM I RN 
WIKHARH 


7116 


866 


95 


RVRMRRNAEVIEEKLSMKSWAKFRPGEPWKGYPNIDPETDPYVT 
PGS VI NNLS INTVRE VDHLRDRNSGSS S S LNTTLPS T5 AWSS IR 










ASNYNVPLSSTAQSTSARNSDSKLTWSPGSVTNTSLAHELWKVP 
LPPKNITAPSRPPPGLTGQKPPLSTWDNSPLRIGGGWGNSDARY 
TPGSSWGBSSSGRITNWIjVIiKNIiTPQlDGSTLRTLCMQHGPLIT 
FHLNLPHGNAIA/RYS S XEE WKAQKSLHI SDLFXLTL 




7117 


695 


1261 


LLISTPGGCHPPPSSIEFTYTGAWGKALPAPHMPCAPGALPQGA 
FVSQAARAI PLLQPSQAAQAEGLSQPARACGAIiCSLPWPLRNWG 
S P I LRLPGG LRTPTNDRKTRTRS AMACWARAQ WDTLG P L KLSHR 
GKVCLRHPRPTGVRGGPGAAGRQGGMGTRRRGTFTSGARDPGGL 
RVKHRCQPTGHLP 




7118 


49 


1863 


PHC E PN PGAG AM VLLHVLFEHAVG YALIiALKE VEE I S L"LQPQVE " 
ESVLNLGKFHS I VR LVA FCPFAS SQ VALENANAVS EG WHEDLR 
LLLETHLPSKKKKVLLGVGDPKIGAAIQEELGYNCQTGGVIAEI 
LRG VRI^FHNL VKGLTDLS ACKAQLGLGHS YSRAKVKFNVNRVD 
NMI I QS IS LLDQLDKDINTFSMRVREWYGYHFPELVKI INDNAT 
YCRIiAQF I GNRRELNEDKLE KLE EIjTMDGAKAKAI LDAS RS S MG 
MD I S A I DI> IN I ES FSSRWS LiS E YRQS LHT YLRS KMS Q VAPS I>S 
AIj I GEAVGAJRIj I AHAGS IjTNIiAKY PAS TVQ I LG AE KALFRALKT 
RGNTPKYGLIFHSTFIGRAAAKNKGRISRYLAWKCSIASRIDCF 
SEVPTSVFGEKLREQVEERLSFYETGEIPRKNLDVMKBAMVQAE 
EAAAEITRKLEKQEKKRLKKEKKRIiAALALASSENSSSTPEECE 
EMS E KPKKKKKQ KPQE VPQENGMEDPS I S FS KP KKKKS FS KEEL 
MSSDLEETAGSTSIPKRKKSTPKEETVNDPEEAGHRSGSKKKRK 
FSKEEPVSSGPEEAAGKSSSKKKKKFHKASQED 




7119 


49 


1863 


PHCEPNPGAGAMVIiliH VLFEHAVG YALLALKEVE EISLLQPQVE 
ESVLNLGKFHS I VRLVAFCPFASSQVALENANAVSEGVVHEDLR 
LTjLE THLPSKKKKVIjLG VGDPK IGAAIQE ELG YNCQTGGV I AE I 
LRGVRLHFHNLVXGLTDLSACKAQLGLGHSYSRAKVKFNVNRVD 
NMI IQSISLLDQLDKDINTFSMRVREWYGYHFPELVKI INDNAT 
YCRLAQFIGNRRELNEDKLEKLEELTMDGAKAKAILDASRSSMG 
MD I S AI DL IN I E S FS S RWS LSE YRQS LHTYLRS KMSQVAPS US 
AIi I GEAVGARL I AHAGSIiTNLAKY PAS TVQ I LGAEKAL FRALKT 
RGNTPKYGLIFHSTFIGRAAAKNKGRISRYLANKCSIASRIDCF 
SEVPTS VFGE KLREQ VEERLS FYETGE I PRKNLDVMKEAMVQAE 
EAAAEITRKLEKQEKKRLKKEKKRIAALAIASSENSSSTPEECE 
EMSEKPKKKKKQKPQEVPQENGMEDPSISFSKPKKKKSFSKEEL 
MSSDLEETAGSTSIPKRKKSTPKEETVNDPEEAGHRSGSKKKRK 
FSKEEPVSSGPEEAAGKSSSKKKKKFHKASQED 




7120 


1991 


64 


QLGTRRCLRGDKVTNAMQDFIjVTNLE PRFI E PQTANXjSWFKDS 
NSTTPIi I FVhS PGTDPAADL YKFAEEMKFSKKLSAISIjGQGQGP 
RAEAMMRSSIERGKWVFFQNCHLAPSWMPALERLIEHINPDKVH 

rdfrlwltslpsnkfpvs ilqngskmtiepprgvranllks yss 
lgedflnschkvmefkslllslclfhgnalerrkfgplgfn 1 P Y 
eftdgdlricisqlkmfldeyddipykvlkytageinyggrvtd 

DWDRRCIMNIIiEDFYNPDVLSPEHSYSASGIYHQIPPTYDLHGY 

LSYIKSLiPLNDMPRTFnT.WTW2lMTTPP?Ar>MTr'T«c , »\T t r*T>-r i-rs-r s>> 

**** *■ ^ i-ustLJi i ir c* a. c ojjniJivj>\i>i x x r x\\£ri c, i r/u_ijjGxI IQLQPK 
SSSAGSQGREEIVEDVTQNIIjIiKVPEPINLQP^VMAKYPVLYEES 

mntvlvqevirynrllqvitqtlqdllkalkglvvmssqlelma 
aslyi^ntvpelwsakaypslkplsswvmdllqrldflqawiqdg 
ipavfwisgfffpqafltgtlqnfarkfvisidtisfdfkvmfe 
apseltqrpqvgcyihglflegarwdpeafqlaesqpkelytem 
aviwllptpnrkaqdqdfylcpiyktltragtlsttghstnyvi 
ave 1 pthqpqrh w i krgvai, i caldy 




7121 


2 


546 


rplrpwvlslgsmvglmtygrrqfqsldttmrrli ppfreasak 
lttlvdadaeaftayleamrlpkntpeekdrrtaalqeglrrav 

S VPLTLAET VAS LW P ALQEIiARCGNLACRSDIjQVAAKALEMG VF 

gayf^nvlinuuditdeafkdoihhrvssllqeaktqaalvldcl 
ETRQE 


7122 


2 


546 


R PLR PWVL»i> I»G SM VGLMT YGR RQ FQS LDTTMRRLI P PPR EAS AK 
LTTLVDADAEAFTAYLEAMRLPKNTPEEKDRRTAALQEGLRRAV 
SVPLTLAETVASLWPALQELARCGNIACRSDLQVAAKALEMGVF 
GAYFNVL INLRD I TD EAFKD Q I HHRVS SLLQ EAKTQAAIjVL>DCL 
ETRQE 




1 


1092 


KPAVPETVRSAGTSE^GRSGAEEVSCGSVSGDGAAMRLTPRALCS 
AAQAAWRENFPLCGRDVARWFPGHMAKGLKKMQSSLKLVDCIIE 
VHDARI PLSGRNPLFQETLGLKPHLLVLNKMDIADLTEQQKIMQ 
HIiEGEGIiKNVIFTNCVKDENVKQI I PMVTELIGRSHRYHRKENL 
BYCIMVIGVPNVGKSSLINSLRRQHLRKGKATRVGGEPGITRAV 
M S KI QVS ER PLMFLLDTPG VLAPR I E S VETGLKLAI/CGTVLDHLi 
VGEETMADYLLYTLNKHQRFGYVQHYGLGSACDNVERVTiKSVAV 

KLGKTQKVKVLTGTGNVNVIQPNYPAAARDFLQTFRRGLLGSVM 
JUDLDVLRGHPRV 


7124 


2 


382 


IiPliTl#IjtjAAPFAHLI*LPPGHDQSPCWHPGPALS PGTLGPLSWAM 
ANSGLQLLGYFLALGGWVGI IASTALPQWKQS S YAGDAS I QLRS 
KVFVLES EWG G DS LGLPRDCG WS CJLLHS AVRS E KG FWS 


7125


166 


1127 


NCISEKRNYSFSMQKGKGRTSRIRRRKLCGSSESRGVNESHKSE ' 
FIELRKWIjKARKFQDSNLAPACFPGTGRGLMSQTSLQEGQMIIS 
LPESCLLT\RDTVIRSYLGAYITKWKPPPSPLLALCTFLVSEKH 
AGHR S LIjEA\ Y I»E I LPKA YTCPVCLE PE VVNLLP KSLKAXAEEQ 
RAHVQEFFASSRDFFSS LQPLFAEAV0S IFS YS AI>LWAWCTVNT 
RAVYL \SPGSGNAFLQSRTPVQLAP YLDLLNHS PHVQVKAAFNE 
ETHS YE IRTTSRWRKHEEVFI C YGPHDNQRL FLE YGFVS VHNPH 
ACVYVSRGWNQLCS 


7126 


1 


733 


CRDMAAFI VPS PARRCSQKGS LGHI*PTQPWLWAAMS PRGQERGT 
SHSQARE PQR PGRWLI/SSIjQ S S PGTLG QAGTASRRRGCM VQR WV 
Q VATGRRAVQ VP KGALGLALGETS PGASRGMSGGAGGCWALGWA 
PSPVLPSWLLEGPPPWLSIISDSGTQRPSPRRCPARPSPWGPQC 
WRGGRIASAEASST*TPGSGSRARSGRRSPGSRRRSASAPSPTP 
PTDACA* SCVARPAGSRSSRPAAA 


7127 


1311 


277 


GLPAMCST*KAGYYEETEGDCIPKDR*IEKRPFKEI*RRIPRIF 
AKQKQI * S*NSQKIGASEIDRGRKEADCSDAPAAARIGAVSVFR ' 
RSTQKARVSPRSNAKSANLRAVRAD*WEHFVLl»FHTPEQFIiAEC 
ICRST**K*WHQLC*PI,SSL*TGI,KRKLLL*VLFRI*WLKDCDV 
* FCQKI FATNFCNWQNLIQ * EE * KPVEYSVEN* H IMNLLLPM * L 
CQSSLRDQTIVTWRM*RNYSMFRINM1SSL* DCS IHI PLKLHFY 
PAL I FTLT VP INS CCQRPLPLFAHQS I KTLAS S GS PMLACLRFL 
LVKKRAFIHTPRSPGCSV*CKHVIiVKDNKNNCVGSEV i 


7128 


2 


5228 


GRVDLWTILLGRSALREIiSQIEAELNKHWRRtjLEGLSYYKPPSP 
SSAEKVKANKDVASPLKELGLRISKFLGLDEEQSVQLLQCYLQE 
DYRGTRDSVKTVLQDERQSQALILKIADYYYEERTCILRCVLHL 
LTYFQDE RHPYRVE YADCVDKLEKEL VS KYRQQFEE L YKTEAPT 
WETHGNXiMTEROVSRMFVOCLnPDC'MT.T t?t tt?t w»vct^mi7»T.c.-rv 

LLVLTKMFKEQGFGSRQTNRHLVDETMDPFVDRIGYFSAIjILVB 
GMDIESLHKCAIiDDRRELHQFAQDGL I CQDMDCLMIjTFGD I PHH 
APVLrAWALLRHTLNPEETSSWRKIGGTAIQLNVFQYLTRLLQ 
SLASGGNDCTTSTACMCVYGLLS FVLTSLEbHTLGNQQDI IDTA 
CEVIADPSLPELFWGTEPTSGLGIILDSVCGMFPHLLSPLLQLL 
RAL VSGKSTAKK V YS FLDKMS F YNEL YKffiCPHD VISHEDGTLWR 
RQTPKLLYPLGGQTNLRIPQGTVGQVMLDDRAYLVRWEYSYSSW 
TLFTCE I EMLlaHVVSTADVIQHCQRVKP I IDLVHKVI STDLSXA 
DCLJjPITSRIYMIiLQRLTTVlSPPVDVIA5CVNCLTVlAARNPA 
KVWTDLRHTG FL PFVAH P VS SLSQMIS AEGMNAGG YGNLLMNS E 
3PQGEYGVTIAFLRLlTTLVKGQLGSTQSQGL.VPCVMFVIiKEML 


7159 






PSYHKWRYNSHGVREQIGCLILELIimil^LCHETDLHSSHTPsH 

LQFIiCICSIAYTEAGQTVINIMGIGVDTIDMVMAAQPRSDGAEG 

QGQG Q1>L I KTVKLAFS VTNNV I RLKP PS NWS PLEQ AL SQHGAH 

GNNLIAVLAKYIYHKHDPALPRLAIQLLKRLATVAPMSVYACLG 

NDAAAI RDAFLTRLQS K\ I E \ DMR I K\ VM I L \ E FLT VA\ VETQ P 

GLI ELFLNLE VKDG\SDGS KEFSLGMW\ S CLHAV / VWEL I DSQQ 

QDRYWCPPLLHRAAIAFLHALWQDRRDSAMLVLRTKPKFWENLT 

SPLFGTLSPPSETSEPSILETCALIMKIICLEIYYWKGSLDQP 

IiKDTLKKFSIEKRFAYWSGYVKSIiAVHVAETEGSSCTSHiEYQM 

LVSAWRMLLIIATTHADIMHLTDSVVRRQLFLDVLDGTKALLLV 

P AS VNCLRLGS MKCTIjLL I LLRQWKRELGS VDE I LG P I/TE I LEG 

VLQADQQLMEKTKAKVFSAFITVLQMKEMKVSDIPQYSQLVLNV 

CBTLQE EV I AL FDQTRHS LALGS ATE D KDSME TDDCS R S RHRDQ 

RDGVCVLGLHLAKELCEVDEDGDSWLQVTRRI,PILPTIjLTTLEV 

slrmkqnlhfteatlhllltlartqqgatavagagitqslclpii 
lsvyqlstngtaqtpsasrksldapswpgvyrlsmslmeqllkt 
lrynflpealdfvgvhqertlqclnavrtvqslacleeadhtvg 

FILQIiSNFMKEWHFHLPQLMRDIQVNLGYLCQACTSFLHSRICML 
QHYLQNKNGDGLPSAV\AQRV\QRPPSAASAAPSSSKQPAADTE 
ASEQQALHTVQYGLIiKILSKTLAALRHFTPDVCQILLDQSLDLA 
EYNFLFALSFTTPTFDSEVAPSFGTIiLATVNVALNMLGELDKICK 
E PLTQAVGLS TQAEGTRTJjKS IiLMFTMENC F YLL. I S QAMR YLRD 
' PAVHPRDKQRMKQELS SELSTLLSSLSRYFRRGAPS S PATGVLP 
SPQGKSTSLSKASPESQEPLIQIjVQAFVRHMQR I 




1 


1054 


FRRFRWRRRLH *AGPASSAGGSPGEASGTM£6£LPPNINTKEPR 
WDQSTFIGRANHFFTVTDPRNIIjLTNEQIiESARKIVHDYRQGIV 

ppgltenelwrakyiydsafhpdtgekmiligrmsaqvpmnmti 
tgcmmtfyrttpavlfwqwinqsfnawnytnrsgdapltvnel 
gtayvsattgavatalglnaltkhvspligrfvpfaavaaanci 
niplmrqrelkvgipvtdengnrlgesanaakqaitqvwsril 
maapgmaippfimnti»ekkaflkrfpwmsapiqvglvgfclvfa 
tplccalfpqkssmsvtsleaelqakiqeshpelrrvyfnkgl ! 


7130 


2 


780 f 


HEVPSLQTSDPLPGSVQRCSVWSQPNKENWCQDHIiYNSL^R^Gj 
ISAKSQPYHRSQSSSSVLINKSMDSINYPSDVGKQQLIiSLHRSS 
RCES HQDLLPD1 ADSHQQGTE KLS DLTLQDS QKWWNRNL PLN 
AQIATQNYFSNFKETDGDEDDYVEIKSEEDESELELSHNRRRKS 
DS KF VDAD FSDNVCSGNTLHS LNS PRTP KKP VNS KLGhS P YI/TP 
YNDSDKLNDYLWRGPS PNQQNI VQS LREKFQCLSSS S FA | 


7131 


805 


573 | 


AAAEGH I EWKFL I EACKVNP FAKDR WGNT PLDDAVQFNHIjE W j 
KLIiQD YQDS YTLS ETQAEAAAEALSKENLE SMV j 


7132 
7133 


1420 


1087 


I DMLLLSGAliVSG P YTL ITTAVS ADLGTHKS IiKGNAHALS TVTA ' 

IIDGTGSVGAALGPLLAGLLSPSGWSNVFYMLMFADACAliLFLI 
RLIHKELSCPGSATGDQVPFKEQ 




2 


3648


QOIPGLLPAHGESGDALRKPRIiQKPITGHDDDLFFTLYPSLBkFH 
EEELLELHVQDHFQEGCGPLDGGALEILERRLRVGVHNGLGFVQ 
RPQWVLVPEMDVALrRSASFSRJCWSSSKTSSGSQAI.VLRSRL 
RLPEMVGHPAFAVI FQLE YVFSS PAGVDGNAA<3 VTQ t .qktt nnvm 1 

MVRWAVWNPLLEADSGRVTLPLQGGIQPNPSHCLVYKVPSASMS 
SEEVKQVESGTLRFQFSLGSEEHLDAPTEPVSGPKVERRPSRKP 
PTSPSSPPAPVPRVIAAPQNSPVGPGLSISQLAASPRSPTQHCL 
ARPTSQLPHGSQASPAQAQEFPLEAGISHLEADLSQTSIiVLETS 
IAEQLQELPFTPLHAPIWGTQTRSSAGQPSRASMVLLQSSGFP 
B I LDANKQ PAEAV5 ATEP VTFNPQKE E SDCLQSNEMVLQ FLAFS 
RVAQDCRGTSWPKTVYFTFQFYRFPPATTPRLQLVQLDEAGQPS 
SGAIiTHILVPVSRDGTFnAGSPGFQLRYMVGPGFliKPGERRCFA 
* YLAVQTLQI D VWDGDS LLL I GSAAVQMKHLLRQGR PAVQASHE 



7134 






2115 



1111 



7135 



7136 



7138 



7139 
7140" 



2072 



418 



466 



466 



357 



1957 



tifci V VATS YEQDNM WSGDMLGFGRV KPIGVHS WKGRLHLTLAN 
VGHPCEQKVRGCSTliPPSRSRVISNDGASRFSGGSLLTTGSSRR 
KHWQAQKLADVDSELAAMLLTHARQGKGPQDVSRESDATRRRK 
LERMRSVRLQEAGGDLGRRGTSVLAQQSVRTQHLRDI^QVIAAYR 
ERTKAES I AS L LS LAI TTEHTLHATLG VABFFEFVLKNPHNTQH 
TVTVEIDNPELSVIVDSQEWRDPKGAAGLHTPVEEDMFHLRGSL 
APQLYLRPHETAHVPFKFQSFSAGQLAMVQASPGLSNEKGMDAV 
S PWKS SAVPTKHAKVIiFRASGGKP I AVLCLTVELQPHVVDQ VFR 
FYHPELSFLKKAIRLPPWHTFPGAPVGMLGEDPPVHVRCSDPNV 
ICETQNVGPGEPRDIFIiKVASGPSPEI KDFFVI IYSDRWLATPT 
QTWQ VYLH SLQRVD VS CVAGQLTRLS L VXiRGTQTVRKVRAFTSH 
PQELKTDPKGVFVLPPRGVQDLHVGVRPLRAGSRFVHLNLVDVD 
CHQLVASWLVCLCCRQPLISKAFEIMLAAGBGKGVNKRITYTNP 
YPSRRTFHLHSDHPELLRFREDSFQVGGGETYTIGLQFAPSQRV 
GEEEILIYINDHEDKNBEAFCVKV IYQ 

UVjlSLj t'S Y P FHVGIiS LGTPLDPHY VLIjEVHYDN PTYEEGli idnsg 
LRLFYTMDIRKYDAGVIEAGLWVSLFHTIPPGMPEFQSEGHCTL 
ECLEEALEAEKPSGIHVFAVLLHAHLAGRGIRLRHFRKGKEMKI, 
LAYDDDFDFNFQEFQYLKEEQT1LPGDNLITECRYNTKDRAEMT 
WGGLSTRSEMCLSYLLYYPRINLTRCASIPDIMEQLQFIGVKEI 
YR P VT T W P FI I KS P KQ YKNIiS FMDAMNKFKWT KKEGLS FNKLVIi 

SLPVNVRCSKTDNAEWSIQGMTALPPDIERPYKAEPLVCGTSSS 
SSLHRPFS INLLVCIiLLLS CTLSTKSL 

FVPRVTPRSLSLQGPKGB SVGSITQPLP^^yLlFRAASESDGRC 
WLDALEbALRCSSLLRLGTCKPGRDGE PGTS PDAS PSSLCGLPA 

SATVHPDQDLFPLNTGSSLENDAFSDKSERENPEESDTETQDHSR 
KTESGSDQSETPGAP VRRGTTYVEQVQEELGELGEAS QVE TVSE 
ENKS LMWTLLKQIiR PGMDLSRVVLPT F VLEPRS FIiNKLS D Y YYH 
ADLLSRAAVEEDAYSRMKLVLRWYLSGFYKKPKGIKKPYNPILG 
ETFRCCWFHPQTDSRTFYIAEQVSHHPPVSAFHVSNRKDGFCIS 
GS I TAKS R FYGNSLS ALLDG KATLTFLNRAED YTLTM P YAHCKG 
ILYGTMTLEI^GKVTrECAKNNFQAQLEFKLKPFFGGSTSINQI 
SGKITSGEEVLASLSGHWDRDVFIKEEGSGSSALFWTTPSGEVRR 
QRLRQHTVPLEEQTELESERLWQHVTRAISKGDQHRATQEKFAL 
EEAQRQRARERQESLMPWKPQLFHLDPITQEWHYRYEDHSPWDP 
LKDIAQFEQDGILRTLQQEAVARQTTFLGSPGPRHERSGPDQRL 
RKASDQPSGHSQATESSGSTPESCPELSDEEQDGDFVPGGESPC 
PRCRKEARRIjQALHEAI ls ireaqqelhrhlsamls staraaqa 

PTPGLLQS PRSWFXiLCVFlACQLFl NHlLK 
D F VP S FRR P ^ GNTS QTVW LLRAATLE KE VAGIjRE kihhlddmlk 

S QQR kvrqmieqlqns kavi qs kdati qelkek iayleaenlem 

HDRMEHLIBKQISHGNFSTQARAKTENPGSIRISKPPSPKPMPV 
IRWET 

WASGMSTVPGGSRHSLGIQVRGGWG VTGGEEESLTVPV7ADTWQA 
GSFKVATQERNPQRAQMRLRRQKKGWPFLGDFIiTELQRLDSAI 

PDDLDGNTNKRSKEVRVLQEriQLLQVAAMNYRLRPLEKFVTYFT 
RMEQLSDKESYKLSCQLE PENP 

WASGMSTVPGGSRHSLGIQVRGGWGV TGGEEESLTVPVADTWQA 

GSFKVATQERNPQRAQMRLRRQKKGWPFLGDFLTELQRLDSAr 

PDDUXSNTWKRSKEVRVLQEMQLLQVAAMNYRiRPLEKFVTYFT 
KMEQLSDKES YKLS CQLE PENP 

S LRNSARGliKMAASAARGAAALRRS I NQ P VAF VRR I P WTAAS SO 
1*KEHFAQFGHVRRCILPFDKETGFHRGLGWVQFSSEEGIjRNALO 
QENHI IDGVKVQVHTRRPKLPQTSDDEKKDF 
j RASSLQ VLKAWUGli I PS S FQQQHTGQ YALKKJUFDLKVYDCFCS F 
' NMNVS LEKQLRPSQ PWPRGKCRKTPGWEEARP KAQDLRGDIjGKT 








QAGPAEARTRGPPRLPAATGCPPHIiPGLLSGlSVDIDPTGLQSQ 
ii* odr/i yA-oiJijay XiijjGLd^KLRKYVVGELI WNF 

ADPMTNQCG 


7141
7142 


124 


1073 


bDSRSCWLDMEDLEEDVRFIVDETLDFGGLSPSDSREEEDITVL 
VTPEKPLRRGLSHRSDPNAVAPAPQGVRLSLGPLSPEKLEEILD 
^ ^J-»"^vijniy w*Jay UKfcbAGEG JbGPRR VKPSPRRET FVLKDS P 
VRDLLPTVNSLTRSTPS /LKQPDASTPE * * *EGVSQGS PGYI WK 
EALQHEEGVTHLQS VPCIQKPS I FSS\SRSTPPVRGRAGPSGRA 
AASEETRAAKLRGAAAKSSCQLP IPSAI PRPASRMPLTSRS VPP 

GRGALPPDSLSTRKGLPRPSTAGHRVRESGHKVPVSQRLNLPVM 
GATRSNLQPP 


7143 


658 


839 


jja rui innmj!,ijiu»ijji3i> v l IjHIKAKJjX Wi<JLiKPTSCI»IFQNVLNLL 
KK*SRAVG\A7WMCRT/YSSDLQVGVIKPWLLLGSQDAAHDLDT 
LKKNKVTHILNVAYGVENAFLSDFTYKSISILDLPETNILSYFP 
ECFEF IE EAKRKDG WLVHCNA 


7144 


3 


773 


aijEMSSDGEPLiSRMDSEDSlSSTIMDVDSTISSGRSTPAMMNGQ 
GS TTS S S KN I AYNCC WDQCQACFNSS PDLADH I RS I HVDGQRGG 

VFVCLWKGCKVYNTPSTSQSWLQRHMLTHSGDKPFKCWGGCNA 
SFASQGGLARHVPTHFSQQNSSKVSSQPKAKEESPSKAGMNKRR 

KLKNKRRRSLARPHDFFDAQTLDAIRHRAICFNLSAHIESLGKG 
HS WFHS TVS I LLFFQIK YKTliQKNIST I ISKS hK I 




1 


988 


i?'K V1S1MQDGGPS PAEHS KAEES AGME ARFLGLPDAAGS S GPTPAR 
RCPAPRPAGVS YVIRDE VEKYNRNGVNALQLDPAliNRLFTAGRD 
SIIRIWSVNQHKQDPYIASMEHHTDWVNDIVLCCNGKTLISASS 
DTT VKVWNAHKGFCMSTLRTHKD YVKAIaAYAKDKE LVAS AGLDR 
QIFLWDVNTLTALTASNNTVTTSSLSGNKDSIYSLAMNQLGTII 
VSGSTEKVLRVWDPRTCAKLMKLKGHTDNVKALLLNRDGTQCLS 
GSSDGTIRLWSLGQQRCIATYRVHDEGVWALQVNDAFTHVYSGG 

rdrkiyctdlrnpdirvlice 
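
The one-letter legend given in the table header can be applied mechanically when reading an amino acid segment. The following is only a minimal sketch: the names AMINO_ACIDS, FLAGS and summarize_segment are illustrative assumptions and do not appear in the specification, and the sketch counts characters only; it does not attempt to locate a signal peptide or to correct transcription errors.

```python
# One-letter residue codes copied from the table legend.
# All identifiers here are hypothetical and used for illustration only.
AMINO_ACIDS = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid",
    "E": "Glutamic Acid", "F": "Phenylalanine", "G": "Glycine",
    "H": "Histidine", "I": "Isoleucine", "K": "Lysine",
    "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine",
    "S": "Serine", "T": "Threonine", "V": "Valine",
    "W": "Tryptophan", "Y": "Tyrosine",
}

# Non-residue symbols from the same legend, mapped to counter names.
FLAGS = {
    "X": "unknown",      # X = Unknown
    "*": "stops",        # * = Stop Codon
    "/": "deletions",    # / = possible nucleotide deletion
    "\\": "insertions",  # \ = possible nucleotide insertion
}


def summarize_segment(segment):
    """Count legend symbols in one amino acid segment.

    Whitespace is skipped. Any character that is not in the legend
    (matching is case-sensitive) is returned in a separate list
    instead of being guessed at.
    """
    counts = {"residues": 0, "unknown": 0, "stops": 0,
              "deletions": 0, "insertions": 0}
    unrecognized = []
    for ch in segment:
        if ch.isspace():
            continue
        if ch in AMINO_ACIDS:
            counts["residues"] += 1
        elif ch in FLAGS:
            counts[FLAGS[ch]] += 1
        else:
            unrecognized.append(ch)
    return counts, unrecognized


if __name__ == "__main__":
    # Final line of the segment listed for SEQ ID NO: 7081.
    print(summarize_segment("RWPQMGTS"))
```

Returning unrecognized characters separately, rather than discarding them, keeps any damaged positions in a segment visible to the reader.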



WHAT IS CLAIMED IS:
1. An isolated polynucleotide comprising a nucleotide sequence selected from the
group consisting of SEQ ID NO:1-1786 and 3573-5358, a mature protein coding portion
of SEQ ID NO:1-1786 and 3573-5358, an active domain of SEQ ID NO:1-1786 and
3573-5358, and complementary sequences thereof.

2. An isolated polynucleotide encoding a polypeptide with biological activity, 
wherein said polynucleotide hybridizes to the polynucleotide of claim 1 under stringent 
hybridization conditions. 

3. An isolated polynucleotide encoding a polypeptide with biological activity, 
wherein said polynucleotide has greater than about 90% sequence identity with the 
polynucleotide of claim 1. 

4. The polynucleotide of claim 1 wherein said polynucleotide is DNA. 

5. An isolated polynucleotide of claim 1 wherein said polynucleotide comprises the 
complementary sequences. 

6. A vector comprising the polynucleotide of claim 1.

7. An expression vector comprising the polynucleotide of claim 1.

8. A host cell genetically engineered to comprise the polynucleotide of claim 1.

9. A host cell genetically engineered to comprise the polynucleotide of claim 1 
operatively associated with a regulatory sequence that modulates expression of the 
polynucleotide in the host cell. 

10. An isolated polypeptide, wherein the polypeptide is selected from the group
consisting of:

(a) a polypeptide encoded by any one of the polynucleotides of claim 1; and

(b) a polypeptide encoded by a polynucleotide hybridizing under stringent
conditions with any one of SEQ ID NO:1-1786 and 3573-5358.

11. A composition comprising the polypeptide of claim 10 and a carrier.

12. An antibody directed against the polypeptide of claim 10.

13. A method for detecting the polynucleotide of claim 1 in a sample, comprising:

a) contacting the sample with a compound that binds to and forms a 
complex with the polynucleotide of claim 1 for a period sufficient to form the complex; 
and 

b) detecting the complex, so that if a complex is detected, the 
polynucleotide of claim 1 is detected. 

14. A method for detecting the polynucleotide of claim 1 in a sample, comprising: 

a) contacting the sample under stringent hybridization conditions 
with nucleic acid primers that anneal to the polynucleotide of claim 1 under such 
conditions; 

b) amplifying a product comprising at least a portion of the 
polynucleotide of claim 1; and 

c) detecting said product and thereby the polynucleotide of claim 1 in
the sample.

15. The method of claim 14, wherein the polynucleotide is an RNA molecule and the 
method further comprises reverse transcribing an annealed RNA molecule into a cDNA 
polynucleotide. 

16. A method for detecting the polypeptide of claim 10 in a sample, comprising:

a) contacting the sample with a compound that binds to and forms a
complex with the polypeptide under conditions and for a period sufficient to form the 
complex; and 

b) detecting formation of the complex, so that if a complex formation
is detected, the polypeptide of claim 10 is detected.

17. A method for identifying a compound that binds to the polypeptide of claim 10, 
comprising: 

a) contacting the compound with the polypeptide of claim 10 under 
conditions sufficient to form a polypeptide/compound complex; and 

b) detecting the complex, so that if the polypeptide/compound 
complex is detected, a compound that binds to the polypeptide of claim 10 is identified. 

18. A method for identifying a compound that binds to the polypeptide of claim 10, 
comprising: 

a) contacting the compound with the polypeptide of claim 10, in a
cell, under conditions sufficient to form a polypeptide/compound complex, wherein the
complex drives expression of a reporter gene sequence in the cell; and

b) detecting the complex by detecting reporter gene sequence 
expression, so that if the polypeptide/compound complex is detected, a compound that 
binds to the polypeptide of claim 10 is identified. 

19. A method of producing the polypeptide of claim 10, comprising,

a) culturing a host cell comprising a polynucleotide sequence selected
from the group consisting of a polynucleotide sequence of SEQ ID NO:1-1786 and 3573-5358,
a mature protein coding portion of SEQ ID NO:1-1786 and 3573-5358, an active
domain of SEQ ID NO:1-1786 and 3573-5358, complementary sequences thereof and a
polynucleotide sequence hybridizing under stringent conditions to SEQ ID NO:1-1786
and 3573-5358, under conditions sufficient to express the polypeptide in said cell; and

b) isolating the polypeptide from the cell culture or cells of step (a). 

20. An isolated polypeptide comprising an amino acid sequence selected from the
group consisting of any one of the polypeptides SEQ ID NO:1787-3572 and 5359-7144,
the mature protein portion thereof, or the active domain thereof.

21. The polypeptide of claim 20 wherein the polypeptide is provided on a polypeptide 
array. 

22. A collection of polynucleotides, wherein the collection comprises the sequence
information of at least one of SEQ ID NO:1-1786 and 3573-5358.

23. The collection of claim 22, wherein the collection is provided on a nucleic acid 
array. 

24. The collection of claim 23, wherein the array detects full-matches to any one of 
the polynucleotides in the collection. 

25. The collection of claim 23, wherein the array detects mismatches to any one of 
the polynucleotides in the collection. 

26. The collection of claim 22, wherein the collection is provided in a
computer-readable format.

27. A method of treatment comprising administering to a mammalian subject in need 
thereof a therapeutic amount of a composition comprising a polypeptide of claim 10 or 20 
and a pharmaceutically acceptable carrier. 

28. A method of treatment comprising administering to a mammalian subject in need 
thereof a therapeutic amount of a composition comprising an antibody that specifically 
binds to a polypeptide of claim 10 or 20 and a pharmaceutically acceptable carrier. 

which is a clearly different method than the methods in the other Groups. Thus, in summary, each of Groups MV is directed to
different special technical features, which supports this lack of unity.

Additionally, each of the claims is directed to more than one species of the generic invention. These species are deemed to lack unity 